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Abstract 


Patients affected by atrial fibrillation are exposed to a fivefold increased risk 
of ischemic stroke. An early detection and diagnosis of the arrhythmia would 
therefore set the course for timely intervention to prevent potentially occurring 
comorbidities. A dilation of the left atrium as well as the presence of fibrotically 
infiltrated atrial tissue are risk factors for atrial fibrillation as they provide the 
necessary substrate for the maintenance of electrical reentrant activity. Identify- 
ing fibrotic atrial cardiomyopathy and left atrial enlargement based on machine 
learning techniques applied to P waves representing the atrial activity in the 12- 
lead electrocardiogram in sinus rhythm could thus be an important means for a 
non-invasive and remote risk stratification of new-onset atrial fibrillation episodes 
and the selection of appropriate subjects for in-depth screening. 

For this purpose, it was investigated if simulated atrial electrocardiogram data 
added to a clinical training dataset of a machine learning model could contribute 
to an improved classification of the above mentioned diseases. Two virtual cohorts 
characterized by both anatomical and functional variability were compiled and 
served as a basis for generating large-scale and unbiased datasets of P waves 
with exactly known ground truth labels of the underlying pathology. In this way, 
the simulated data fulfilled the essential requirements for the development of 
a machine learning algorithm what sets them apart from clinical data usually 
not available in large numbers in evenly distributed classes and labels possibly 
affected by inadequate expert annotations. 

A shallow feature-based feedforward neural network was set up to perform 
the regression task of predicting the tissue volume fraction of left atrial fibrosis. 
Compared to training the model only on clinical data, training on a hybrid dataset 
led to a reduction of the absolute estimation error from 17.5 % fibrotic volume 
on average to 16.5 % evaluated on a clinical test set. A long short-term memory 
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network tailored at performing the binary classification task between P waves 
of healthy subjects and left atrial enlargement patients yielded an accuracy on a 
clinical test set of 0.95 when trained on a hybrid dataset, of 0.91 when trained on 
clinical data only comprising samples with 100 % label certainties and of 0.83 
when trained on clinical data including all samples independent on their label 
certainties. 

The results of the studies presented in this thesis demonstrate that electrocar- 
diogram data resulting from electrophysiological modeling and simulations on 
virtual patient cohorts, covering relevant variability aspects complying with real- 
world observations, can be a valuable data resource for improving automated atrial 
fibrillation risk stratification. In this regard, the drawbacks of clinical datasets 
for developing machine learning models can be compensated for. This ultimately 
leads to an enhanced early detection of the arrhythmia, which allows for choosing 
appropriate treatment strategies in due time and thus, reduces the risk of stroke in 
affected patients. 
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Zusammenfassung 


Patienten mit Vorhofflimmern sind einem fünffach erhöhten Risiko für einen 
ischämischen Schlaganfall ausgesetzt. Eine frühzeitige Erkennung und Diagnose 
der Arrhythmie würde ein rechtzeitiges Eingreifen ermöglichen, um möglicher- 
weise auftretende Begleiterkrankungen zu verhindern. Eine Vergrößerung des 
linken Vorhofs sowie fibrotisches Vorhofgewebe sind Risikomarker für Vorhof- 
flimmern, da sie die notwendigen Voraussetzungen für die Aufrechterhaltung 
der chaotischen elektrischen Depolarisation im Vorhof erfüllen. Mithilfe von 
Techniken des maschinellen Lernens könnten Fibrose und eine Vergrößerung 
des linken Vorhofs basierend auf P Wellen des 12-Kanal Elektrokardiogramms 
im Sinusrhythmus automatisiert identifiziert werden. Dies könnte die Basis für 
eine nicht-invasive Risikostratifizierung neu auftretender Vorhofflimmerepisoden 
bilden, um anfällige Patienten für ein präventives Screening auszuwählen. Zu 
diesem Zweck wurde untersucht, ob simulierte Vorhof-Elektrokardiogrammdaten, 
die dem klinischen Trainingssatz eines maschinellen Lernmodells hinzugefügt 
wurden, zu einer verbesserten Klassifizierung der oben genannten Krankheiten 
bei klinischen Daten beitragen könnten. Zwei virtuelle Kohorten, die durch 
anatomische und funktionelle Variabilität gekennzeichnet sind, wurden gener- 
iert und dienten als Grundlage für die Simulation großer P Wellen-Datensätze 
mit genau bestimmbaren Annotationen der zugrunde liegenden Pathologie. Auf 
diese Weise erfüllen die simulierten Daten die notwendigen Voraussetzungen 
für die Entwicklung eines Algorithmus für maschinelles Lernen, was sie von 
klinischen Daten unterscheidet, die normalerweise nicht in großer Zahl und in 
gleichmäßig verteilten Klassen vorliegen und deren Annotationen möglicherweise 
durch unzureichende Expertenannotierung beeinträchtigt sind. Für die Schätzung 
des Volumenanteils von linksatrialem fibrotischen Gewebe wurde ein merkmals- 
basiertes neuronales Netz entwickelt. Im Vergleich zum Training des Modells 
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Zusammenfassung 


mit nur klinischen Daten, führte das Training mit einem hybriden Datensatz zu 
einer Reduzierung des Fehlers von durchschnittlich 17,5 % fibrotischem Volumen 
auf 16,5 %, ausgewertet auf einem rein klinischen Testsatz. Ein Long Short-Term 
Memory Netzwerk, das für die Unterscheidung zwischen gesunden und P Wellen 
von vergrößerten linken Vorhöfen entwickelt wurde, lieferte eine Genauigkeit von 
0,95 wenn es auf einem hybriden Datensatz trainiert wurde, von 0,91 wenn es 
nur auf klinischen Daten trainiert wurde, die alle mit 100 % Sicherheit annotiert 
wurden, und von 0,83 wenn es auf einem klinischen Datensatz trainiert wurde, der 
alle Signale unabhängig von der Sicherheit der Expertenannotation enthielt. 

In Anbetracht der Ergebnisse dieser Arbeit können Elektrokardiogrammdaten, 
die aus elektrophysiologischer Modellierung und Simulationen an virtuellen Pa- 
tientenkohorten resultieren und relevante Variabilitätsaspekte abdecken, die mit 
realen Beobachtungen übereinstimmen, eine wertvolle Datenquelle zur Verbesser- 
ung der automatisierten Risikostratifizierung von Vorhofflimmern sein. Auf diese 
Weise kann den Nachteilen klinischer Datensätze für die Entwicklung von Mod- 
ellen des maschinellen Lernens entgegengewirkt werden. Dies trägt letztendlich zu 
einer frühzeitigen Erkennung der Arrhythmie bei, was eine rechtzeitige Auswahl 
geeigneter Behandlungsstrategien ermöglicht und somit das Schlaganfallrisiko 
der betroffenen Patienten verringert. 
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Introduction 


1.1 Motivation 


In addition to numerous technical advances and public debates in the last decade, 
the global COVID-19 pandemic has most recently again brought the need for and 
benefits of digitalization in healthcare and the rollout of telemedicine services 
into focus. These technologies call for artificial intelligence and machine learning 
solutions [1] supporting clinical decision making and remote patient monitoring, 
guiding therapy and predicting clinical outcome. High throughput as well as the 
ability to discern patterns by examining multivariate feature combinations and 
detecting signal or image characteristics not conceivable for a human observer are 
among the advantages of machine learning-based analysis tools in medicine. 

As a non-invasively acquirable, easily accessible, inexpensive and widely 
available system in hospitals worldwide, the 12-lead electrocardiogram (ECG) is 
a powerful diagnostic tool in clinical practice to monitor and evaluate the cardiac 
function. As such, it can be an alternative to intracardiac signals recorded during 
invasive procedures or to expensive imaging techniques requiring trained person- 
nel and entailing potential exposure to radiation. Thus, the automated analysis of 
ECG data has great potential as a preventive screening tool for cardiovascular dis- 
eases [2, 3], which are the leading cause of death in most developed countries [4]. 
In particular, patients affected by atrial fibrillation (AF), the most common sus- 
tained cardiac arrhythmia [5-9], would benefit from remote ECG monitoring 
and automated analysis software [10-12] as a considerable percentage of AF 
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patients remain asymptomatic [13]. If left untreated, silent and sub-clinical AF 
can lead to unexpected and life-threatening thromboembolic events and ischemic 
stroke [14, 15]. 

However, progress in developing ECG-based computer assisted diagnostic 
tools is hampered due to limited supply of appropriate patient datasets. High- 
quality data curation, collection and preparation is time-consuming and involves 
considerable effort and expenses preventing open data sharing across centers and 
research groups. Furthermore, patient data is highly sensitive explaining the strict 
regulations in sharing and reuse of medical datasets. In the few open [16] or gated 
access [17] ECG databases available online, data samples usually stem from only 
a limited number of sources. Thus, patient enrollment in single center studies 
may not only cause an imbalance among the classes to be distinguished by the 
machine learning model, but may also introduce a bias in both demographic, as 
for example gender [18-20], age or ethnicity [21], and technical aspects, such 
as device manufacturer or acquisition site [22]. To avoid a distorted prediction 
adversely discriminating under-represented groups and to prevent overfitting of the 
model to the given input samples, datasets employed for machine learning models 
must not only be as large, but also as diverse as possible [23]. Moreover, inter- 
and intra-observer variability clearly impairs expert signal annotation certainty 
and quality [24] impeding the training of a supervised learning algorithm from 
the outset. 

Computational models of the heart have contributed to elucidating fundamental 
cardiac disease mechanisms in various previous studies. In contrast to data-driven 
synthetic ECG generators based on generative adversarial networks [25, 26], 
mechanistic electrophysiological models underlying the simulation do not only 
allow for composing large-scale, unbiased, noise-free and reliably labeled cardiac 
signals, but also provide a well-controlled framework to identify the mechanisms 
driving and terminating cardiac arrhythmias. Recently, personalized multiscale 
modeling of cardiac electrophysiology demonstrated added clinical value for 
example in the fields of ventricular tachycardia [27-29] and AF [30-32]. To 
generate mechanistic insight that is not only applicable to a single patient but 
to entire subgroups of the population as a complementary source of evidence 
and data for automated signal analysis and disease classification, cohort based 
modeling capturing relevant anatomical, disease-specific and functional variability 
occurring in clinical practice is warranted. 


1.2. State of the Art 


In contrast to cardiac digital twins [33, 34], where a personalized model of a 
patient’s heart is replicated in silico to allow for patient-specific therapy guidance, 
cohort modeling aims at predicting risk and investigating the efficacy of differ- 
ent therapy options on a population basis and derive specific treatment criteria 
applicable to entire subgroups of the population. Covering variability in a virtual 
population is key for generating large-scale synthetic ECG datasets encompassing 
a wide range of signal variation prevailing also in clinical cohorts [35, 36]. These 
cohorts of computational models can form the basis for large-scale mechanistic 
simulations of cardiac signals in a well-controlled environment enriching and 
extending clinically recorded datasets for machine learning applications to investi- 
gate the potential of the 12-lead ECG for non-invasive, automated and remote AF 
risk stratification. 


1.2 State of the Art 


Since AF is the most frequently clinically diagnosed supraventricular tachycardia 
worldwide and is associated with an increased risk of stroke, various clinical 
studies have focused on individual risk stratification to allow for timely and appro- 
priate intervention or prevent new-onset AF episodes altogether [37-40]. Among 
them is the Apple Heart Study, where a smartwatch worn by >400,000 participants 
was meant to release a notification in case of irregular pulse occurrence [41] 
detected using photoplethysmography. Upon receiving the notification, eligible 
patients were mailed an ECG patch to be worn for up to 7 days so that trained 
physicians could inspect the long-term recordings and possibly confirm the AF 
diagnosis. During a period of about 4 months, 0.52 % of the participants received 
an irregular pulse notification. Among the patients returning the ECG patch, the 
positive predictive value of AF actually being detectable in the ECG was 0.84. 
Attia et al. [42] applied a convolutional neural network to 10 second 12-lead ECG 
traces in normal sinus rhythm and succeeded in identifying AF patients with an 
accuracy of 79.4%. Kurshid et al. [43] showed that fitting a hazard model not 
only with the clinical CHARGE-AF risk score [44, 45] but additionally with the 
5-year AF probability predicted by a convolutional neural network trained with 
clinical 12-lead ECGs yielded superior results than using clinical risk scores alone. 
Eichenlaub et al. [46] showed that patients in whom AF recurs after ablation 
therapy can be identified based on non-invasively recorded body surface poten- 
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tial maps with 91.3% sensitivity and 93.7% specificity. Thus, different previous 
studies have confirmed that machine learning enabled algorithms are suitable for 
an early identification of AF patients based on sinus rhythm ECGs of different 
clinical cohorts. To address and overcome the shortcomings of clinical ECG data 
as elaborated on in section 1.1, numerous simulation studies have been carried out 
predominantly for optimizing AF therapy planning. 

Regarding AF treatment by ablation, Luongo et al. [47] showed that a pre- 
diction of acute pulmonary vein isolation success based on the ECG is possible 
with a specificity of 82% and a sensitivity of 73% on a clinical test set of 46 AF 
patients. Importantly, this classifier was trained exclusively on synthetic data 
(1,128 ECGs) and tested on clinical data [47]. By drawing on simulations, the vul- 
nerability of the atria for developing new-onset AF episodes can be evaluated [48]. 
Thereafter, the suitability of pharmaceutical agents [49, 50] or virtual ablation 
lines [30, 31, 51] regarding termination of rotational activity can be assessed using 
computational models. However, a major challenge in previous simulation studies 
focusing on therapy planning for entire subpopulations of patients usually lies in 
the generation of large populations of atrial models. 

To create a patient-specific computational atrial model, a geometrical repre- 
sentation of the atria obtained from tomographic imaging or electroanatomical 
mapping studies serves as a basis to replicate a patient's heart in silico [52]. The 
process of segmenting tomographic images and building a simulation-ready atrial 
geometrical model from the resulting endocardial wall, requires time and rule- 
based augmentation tools [53]. Thus, most of the simulation studies mentioned 
before were conducted by employing only a limited number of patient-specific 
geometries. This is acknowledged as a limitation in these publications, as observ- 
ing only few patients might not be sufficient to thoroughly represent anatomical 
variability of the atria across a population. However, large populations of ionic 
models have been compiled in previous studies by varying conductances of several 
ion channels in the cell model to represent the biological variability observed in 
experimental studies [54]. These populations of models have been used to study 
drug-induced effects on different action potential and ionic current biomarkers to 
predict either the risk of cardiotoxicity in the field of safety pharmacology [55- 
58] or to determine the efficacy of different channel blockers and their doses on 
anti-arrhythmic properties in the field of precision medicine [50, 59-63]. It has 
been shown that these in silico drug testing methods can predict cardiotoxicity 
more accurately than animal trials [55]. 
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1.3. Research Question and Hypothesis 


These populations of cell models aim at capturing the cellular electrophysio- 
logical variability. To cover the aspect of cardiac shape and anatomical variability 
in cohorts of computational models, statistical shape models (SSMs) provide 
the basis to generate a large variety of geometrical models representing atrial 
anatomy and its clinically observed statistical variability [64]. SSMs of the human 
heart have previously been constructed for the purpose of active shape modeling 
segmentation approaches or to investigate the suitability of left atrial shape scores 
as a predictor for AF [65, 66]. 

In consequence, addressing the disadvantages of clinically recorded ECG 
data for AF risk prediction by drawing on simulations is a promising approach. 
However, the lack of comprehensive and calibrated datasets of atrial multiscale 
computational models bars the way to progress for extensive atrial ECG simu- 
lations. Based on these aspects, the aim of this thesis is defined as outlined in 
section 1.3. 


1.3 Research Question and Hypothesis 


The aim of this thesis is to simulate and validate large-scale synthetic ECG datasets 
covering various types of underlying model variability applicable as an extension 
to clinical input data for machine learning algorithms and in this way counteract 
the shortcomings of clinically recorded patient data. Thereby, a particular focus is 
set on the application of machine learning techniques for detecting risk markers 
for AF from 12-lead ECGs in normal sinus rhythm. Specifically, the following 
research question and hypothesis are aimed to be answered and put to test: 
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Research Question 


Can modeling and simulation applied to populations of 
multiscale atrial models covering various types of vari- 
ability contribute to overcome the lack of high quality, 
unbiased, well-controlled and large-scale clinical ECG 
datasets to identify AF susceptible patients? 


Hypothesis 


It is hypothesized that the performance of machine learn- 
ing algorithms trained to identify patients with fibrotic 
atrial substrate and enlarged left atrial volumes improves 
when adding simulated atrial ECGs to the input dataset 
compared to using clinical data only during training. 


The following methodological milestones relevant to answer the research questions 
presented above are addressed and implemented in this thesis: 


Assessing fidelity of simplified simulation models by evaluating the accu- 
racy of resulting action potential (AP) and ECG signals as well as local 
activation times (LATs) compared to the simulation results obtained from 
the gold standard methods to speed up the generation of large, yet physio- 
logically accurate, simulated ECG datasets. 

Development and evaluation of a bi-atrial SSM as a basis to compile a wide 
range of anatomical models applicable to electrophysiological simulations. 
Calibration of a large population of multiscale atrial computational models 
comprising anatomical and functional variability to ensure realistic and 
representative phenotypic variability of simulated ECGs comparable to 
clinical data variability. 

Generation of a simulated ECG dataset comprising 10,000 signals of 10 
seconds length representing a healthy cohort and selected pathologies. 
Application of simulated atrial ECGs as an additional input source to ma- 
chine learning algorithms to identify patients with fibrotic atrial substrate 
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and enlarged left atrial volumes at a higher accuracy than by using clinical 
data only for training the classifier. 


Summarizing, the research presented in this thesis is meant to serve as ground- 
work to support and contribute to the vision of an early and non-invasive identifi- 
cation of asymptomatic AF patients via remote monitoring, selecting susceptible 
patients for in-depth AF screening and in this way reduce their risk of stroke 
through timely intervention. 


14 Structure of the Thesis 


The thesis is structured as described below. 


Part I outlines selected fundamentals relevant to the analysis in this work and the 
resulting findings. 

° In Chapter 2, the underlying physiological principles of the cardiac elec- 
trical excitation system and the arrhythmogenesis of atrial fibrillation are 
described. 

e Chapter 3 summarizes the mathematical underpinnings of electrophysi- 
ological cardiac modeling and simulation, statistical shape modeling and 
machine learning. 


In Part II, the drawback of biophysically detailed simulation approaches being 
demanding in terms of computational resources to solve the model equations 
is addressed. Especially for large-scale simulations of atrial electrophysiology, 
as they should be conducted for the studies presented in this thesis, a fast and 
accurate simulation framework is indespensable. 
° In Chapter 4, the applicability of computationally less complex though also 
biophysically less detailed simulation models is investigated by comparing 
the simulation results between simplified and gold standard methods. 


Part III describes the generation of computational atrial model cohorts. 
e Chapter 5 details the construction and evaluation of an SSM of the human 
atria. In this way, the limitation of small sample sizes of atrial geometries in 
previous simulation studies is addressed which led to an insufficient capture 
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of anatomical variability and of resulting ECG morpholosy in these virtual 
multiscale model cohorts. 

In Chapter 6, the generation and calibration of a population of multiscale 
atrial models built based on geometrical instances of the SSM combined 
with electrophysiological variability on tissue and cellular scale are de- 
scribed. 

In Chapter 7, the generation of 10,000 12-lead ECGs is shown. These sig- 
nals comprise not only the atrial activity of a single heartbeat but represent 
instead a complete 10 second time series of simulated cardiac signals for a 
healthy control group and selected pathologies. In this way, the synthetic 
signals are of a comparable format to actual clinical recordings without 
the need for preceding waveform delineation or feature extraction and thus 
provide the necessary input for a potential application of deep learning 
methods based on multi-dimensional time series signals. 


Part IV refers to the aim of this thesis to identify patients at high risk for devel- 
oping new-onset AF episodes supported by simulated ECG signals. Thus, the 
generated model populations were applied for large-scale simulations of atrial ac- 
tivity in sinus rhythm, optionally considering anatomical, electrical and structural 
remodeling typically occurring in AF patients. Machine learning techniques are 
applied to distinguish between healthy individuals and patients with fibrotic atrial 


cardiomyopathy or left atrial enlargement, both of which are reported to promote 
and contribute to the arrhythmogenesis of AF. 


In Chapter 8, P wave features were extracted from simulated and clini- 
cal ECG datasets comprising healthy subjects and patients with fibrotic 
substrate patches in the atria. These were subsequently used as input to a 
shallow neural network tailored at detecting fibrotic atrial cardiomyopathy. 
In Chapter 9, the signals representing a healthy and a left atrial enlargement 
population were fed into a long short-term memory network to distinguish 
between the two cohorts. The study design was again translational to 
identify the added value that simulated data can carry when extending 
clinically recorded ECGs as input for machine learning techniques. 


An outlook for future projects and a general conclusion drawn from the findings of 
the research leading to this thesis are outlined in Chapter 10 and 11, respectively. 


PARTI 


FUNDAMENTALS 


Chapter 2 


Medical Fundamentals 


2.1 Cardiac Anatomy and Physiology 


The human heart is located behind the sternum in the chest and consists of four 
chambers: The left and the right atria are connected through the mitral and 
tricuspid valve to the left and right ventricles below. The atrial and ventricular 
septal wall separate the left from the right cardiac chambers (see Figure 2.1). [67- 
69] 

The cardiac cycle starts with ventricular diastole. The left and the right atrium 
receive blood through the pulmonary veins from the lungs and mostly through 
the venae cavae from the body periphery, respectively. The atria contract and 
pump blood through the valves into the ventricles. During ventricular systole, the 
ventricles contract. Oxygenated blood from the left ventricle is ejected through 
the aorta branching into different arteries connecting the heart to other organs 
and systems in the human body. Right ventricular contraction initiates the flow 
of oxygen-depleted blood through the pulmonary valve to the lungs where it is 
reoxygenated. Thereafter, the ventricles relax again and the next cardiac cycle 
starts. [67-69] 
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Bachmann’s bundle 


\ Left atrium 


Right atrium 


Tricuspid valve 


Left ventricle 
Right ventricle 


Figure 2.1: Anatomy of the human heart. 


2.2 Electrical Excitation System 


The contraction of the heart is caused and preceded by its electrical activation, 
which can be measured non-invasively on the body surface via electrocardiography 
(see Figure 2.2). For the acquisition of a 12-lead electrocardiogram (ECG), 9 
electrodes are placed on the left and right arm (LA, RA), the left leg (LL) and 
along the rib cage (v1-v6). In this way, the ECG in 12 channels representing 
different axes of excitation propagation can be calculated as the difference in 
potential recorded at the electrode positions as: 


I=LA-RA 
I =LL-RA 
II=LL-LA 


aVR =RA — (LA +LL)/2 
aVL =LA — (RA +LL)/2 
aVF =LL — (LA +RA)/2 
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Figure 2.2: Genesis of the electrocardiogram. 
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The excitation origin of the electrical activation in the heart is triggered by the 
sinoatrial node located in the right atrium. It automatically releases electrical 
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stimuli at a pace determining the heart rate in physiological sinus rhythm scenar- 
los [70]. The preferred direction of activation in the right atrium is defined by 
rapid and anisotropic conduction via crista terminalis and the pectinate muscles 
on the posterior wall and Bachmann’s bundle on the anterior wall. As both atria 
are electrically isolated from one another at the septal wall, the activation of the 
left atrium in the healthy sinus rhythm case takes place via interatrial connections, 
such as Bachmann’s bundle on the anterior wall or posterior bridges and the 
coronary sinus bridge. The activation of both atria is represented by the P wave 
in the ECG. Following the atrial activation, the depolarization wavefront is de- 
layed at the atrioventricular node, which reflects in the PQ segment in the ECG. 
Subsequently, the ventricles are activated through the bundle of His conducting 
electrical impulses to the terminal branches of the Purkinje fibers from where the 
excitation transfers to the ventricular myocardial tissue causing the QRS complex 
in the ECG. Atrial repolarization takes place simulateneously to ventricular acti- 
vation and thus, a supposedly occurring T, wave does not show in the ECG as it 
is burried within the QRS complex. Finally, the repolarization of the ventricles 
cause an upright and concordant T wave in the ECG as tissue regions in close 
proximity to the ventricular apex start to repolarize first before the repolarization 
wavefront traverses the ventricular tissue mainly in apicobasal direction towards 
the valves. [71, 72] 

The electrophysiological principles underlying the activation initiation and 
propagation as well as the formation of cardiac electrical sources measurable on 
the body surface can be explained on a cellular level. If a cell in the myocardial 
tissue is activated, an action potential is triggered. The latter describes the time 
course of the difference in potential across the cell membrane between the intra- 
and extracellular space of a single cell arising due to differences in ionic concen- 
trations in both domains [71, 72]. The typical signal course of an atrial action 
potential is depicted in Figure 2.3. 

In phase 4, the cell is not yet depolarized and a resting membrane potential of 
approximately -80 mV prevails [73]. When triggered, e.g., by an action potential 
of a neighboring cell, fast Na* channels open causing an inflow of sodium to 
the intracellular space and consequently, an increase in transmembrane voltage 
in phase 0. The peak conductance of the Nat channels coincides with the time 
step of the steepest action potential upstroke denoting the activation time of 
the cardiomyocyte. The peak of the action potential in phase 1 is marked by a 
transmembrane voltage of approximately +20 mV and Na channels are inhibited. 
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Figure 2.3: Time course of an atrial action potential. Phase 0-4 upstroke, peak, plateau phase, 
repolarization, and resting membrane potential, respectively. APD = action potential duration, 
RP = refractory periord 


In response to the change in transmembrane voltage, K+ and Ca2+ channels 
open. Potassium outward and calcium inward currents electrically balance each 
other which reflects in the plateau phase of constant voltage in phase 2. As the 
potassium outward current starts to dominate in phase 3, the transmembrane 
voltage declines again during the repolarization phase. During the refractory 
period (see Figure 2.3), N a* channels remain closed, sodium cannot leak in 
the intracellular space and thus, the cell would not respond with the initiation 
of a new action potential when triggered in this interval. The refractory period 
of a cardiomyocyte is a protective mechanisms to prevent rapid and recurring 
depolarization of the the same cardiac tissue regions. The decline in voltage in the 
repolarization phase lasts until the resting membrane potential of approximately 
-80 mV is reached again marking the offset of the action potential duration (APD). 
Sodium-calcium exchangers and sodium-potassium pumps ensure a restoration of 
the initial ion distribution between the intra- and extracellular space so that a new 
action potential can be triggered in the subsequent cardiac cycle. [71] 
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Moving from single cell to the excitation of an entire tissue patch, the action 
potential is propagated via gap junctions connecting adjacent cells and thus cause 
the electrical depolarization wave to advance across the myocardial tissue. The 
speed the activation wavefront traverses the tissue with is commonly referred to 
as conduction velocity (CV). [71] 


23 Atrial Fibrillation 


Atrial fibrillation (AF) is the most frequently encountered cardiac arrhythmia and 
affects 2-3% of the population in Europe and North America [5]. Advanced age 
is one of the risk factors for developing new-onset AF episodes. Therefore, even 
higher AF prevalence rates are expected for the years to come due to the ongoing 
aging of the society in industrialized countries. In contrast to normal sinus rhythm, 
AF episodes are characterized by disorganized reentrant waves traversing the atrial 
tissue. The maintenance of the arrhythmia requires either rapid foci firing or a 
remodeled substrate supporting the reentry, for example due to fibrotic infiltrations 
that alter conduction and cellular electrophysiology facilitating the formation of 
reentry circuits. Due to hemodynamic impairment and blood stasis induced by the 
electrical disorganization, AF favors the formation of blood clots and therefore in- 
creases the risk of ischemic stroke for individual patients as well as morbidity and 
mortality rates on a population level. Consequently, the development of optimal 
treatment strategies is crucial to not only reduce the enormous financial burden 
for the public health system of 45 billion € per year in the European Union [74] 
but also to improve the quality of life for affected patients by achieving long-term 
freedom from arrhythmia recurrence. Anti-arrhythmic drug therapy is one of the 
treatment options for AF to either control the heart rate in rate control strategies or 
to restore and maintain sinus rhythm in rhythm control strategies [75]. However, 
a patient-specific selection of the drug as well as a personalized adjustment of 
the optimal drug dose is likely necessary to maximize therapeutic yield. An 
alternative first-line therapy or follow-up treatment option for drug-resistant AF 
patients is catheter ablation [75]. During the minimally invasive procedure, scars 
are created at selected locations on the myocardial tissue, which aim at blocking 
pathological excitation pathways. To treat substrate-based arrhythmia in persistent 
AF patients, a mapping procedure precedes the intervention to localize the origin 
of the pathological excitation [76, 77]. However, neither therapeutic option can 
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provide permanent cure for AF as recurrence rates of 30-60% prevail [78]. Recent 
work showed that a patient-specific ablation approach by targeting atrial areas of 
high spatiotemporal dispersion in the electrograms arising due to locally reduced 
conduction velocities can reduce arrhythmia recurrence rates to 14% whereas the 
patient group undergoing pulmonary vein isolation alone was characterized by 
41% recurrence [79]. Moreover, Jadidi et al. [80] found that low voltage guided 
ablation procedures can help to identify additional ablation targets for arrhythmia 
termination in persistent AF patients. 

Structural and anatomical remodeling of the atria, as for example fibrotic 
substrate or enlargement of the atrial chambers, frequently occurs together with 
AF and is commonly referred to as atrial cardiomyopathy. Although it is still 
under debate whether remodeling is a cause or a consequence of AF, the presence 
of atrial fibrosis and dilation of the left atrium are known to set the stage for the 
development of reentrant activity as will be explained in section 2.3.1 and 2.3.2. 


2.3.1 Fibrotic Atrial Cardiomyopathy 


Arrhythmogenic fibrotic atrial cardiomyopathy (FAM) manifests in the presence 
of fibrotic substrate replacing and interfering with healthy myocardial tissue in 
the atria. The proliferation and activation of fibroblasts leads to an accumulation 
of collagen and other extracellular matrix proteins in the interstitial space and 
cause neighboring cells to be electrically isolated from one another impeding 
transverse wave propagation [81]. Furthermore, down-regulation of gap junctions 
entails conduction slowing in fiber direction. Generally, the resulting reduced 
conduction velocity facilitates the formation of reentry circuits in AF patients. 
This is because the time for a wavefront reaching excitable tissue regions after 
traversing an anatomical or functional obstacle is increased. This comes along 
with an increased likelihood that firstly, the effective refractory period of the cells 
in the vulnerable region has already passed, secondly, a new action potential 
in these cells can be triggered and finally, a reentry on a macroscopic level can 
occur. Ionic remodeling in fibrotic regions also contributes to promoting AF as a 
reduction of Gk, and Gca conductances in fibrotically infiltrated regions cause 
a shortened refractory period [82]. The inhomogeneities in refractory period as 
well as fibroblast-myocyte coupling can also lead to uni-directional conduction 
block increasing reentry vulnerability. 
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State-of-the-art methods for identifying fibrosis in the atria comprise elec- 
troanatomical voltage mapping and late Gadolinium enhancement magnetic reso- 
nance imaging (LGE-MRI). The former is acquired during a minimally invasive 
procedure preceding catheter ablation. Applying a threshold of 0.5 mV for bipolar 
electrograms recorded in sinus rhythm allows for the identification of fibrotic 
areas [83, 84] serving as potential additional ablation targets in recurrent AF after 
pulmonary vein isolation. LGE-MRI is an imaging technique with Gadolinium- 
based contrast agents requiring expensive equipment and trained personnel for seg- 
menting areas of high intensity image ratios to be identified as fibrosis. However, 
the spatial location of fibrosis found based on LGE-MRI and electroanatomical 
mapping is reported to differ from one another. Furthermore, there is no clear 
consensus in the community on the efficacy of choosing areas of anchoring rotors 
as additional targets for catheter ablation up to date [31, 80, 85]. 


2.3.2 Left Atrial Enlargement 


Structural remodeling in AF patients comprise among others an enlargement of the 
left atrium occurring as a consequence of chronic pressure overload. An enlarge- 
ment of the left atrial chamber leads to an increased path length of the reentrant 
wave in the atria resulting in an increased likelihood that a reentrant wavefront 
hits excited and non-refractory tissue in the critical temporal window of AF vul- 
nerability [86, 87]. Thus, left atrial enlargement (LAE) can be an independent 
predictor for AF [65] which is why an early detection of LAE could contribute 
to effectively select asymptomatic individuals for in-depth screening and thus 
mitigate or prevent severe disease progression. Trans-thoracic echocardiography 
is acommon diagnostic tool to identify LAE patients. Chamber quantification 
guidelines recommend a threshold value of 34 ml/m2 for the atrial volume indexed 
to the body surface area of the patient for a standardized diagnosis of LAE [88]. 
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Mathematical Fundamentals 


3.1  Multi-scale Electrophysiological 
Modeling 


Computational cardiac modeling and simulation carry the potential of providing 
profound mechanistic insights into the properties underlying cardiac excitation 
initiation and propagation [89, 90]. As such, they can contribute to overcome 
the lack of experimental and clinical signals of human cardiac action potentials, 
electrograms and electrocardiograms (ECGs) as clinical data collection requires 
trained personnel, ideally extensive patient enrollment and entails cumbersome 
and time-consuming laboratory work. 

Applying computational models under healthy and diseased conditions to elec- 
trophysiological simulations has been shown to reveal insights into the underlying 
principles of cardiac (dys-)function and provide a framework for developing new 
therapy options on multiple levels (see Figure 3.1): On cellular scale, cell models 
are built to replicate ionic current flow in silico leading to the action potential 
of a cardiomyocyte and thus, can improve the understanding of mechanisms 
underlying cellular pathophysiology and the impact of pharmacological agents 
on them. The mathematical underpinnings in these cell models are described in 
section 3.1.1. 

On tissue scale, the spread of the electrical depolarization wave on the car- 
diac tissue as well as the genesis of electrograms can be studied by employing 
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Figure 3.1: Bottom-up design of multi-scale cardiac electrophysiological modeling. 


propagation models (see section 3.1.2). In this way, various disease phenomena, 
such as arrhythmia vulnerability or rotor dynamics, could potentially be traced 
back to electrophysiological properties of the heart in the computational model. 
Moreover, the impact of ablation lines or pharmacological treatment on arrhythmia 
termination and recurrence can be investigated on tissue level. 

By solving the forward problem of electrocardiography (see section 3.1.3), the 
electrical source distribution in the heart can be projected onto the body surface. 
In this way, body surface potential maps are obtained and from them, ECGs can 
be extracted. These provide the basis for systematically and reliably investigating 
the impact of possible underlying cardiac abnormalities on the ECG and in turn, 
stratifying arrhythmia risk by only employing non-invasively measurable signals 
on the body surface. 


3.1.1 Cell Models 


In 1952, Hodgkin and Huxley developed the first mathematical model describing 
the electrical behaviour of a cell. Since then, various different cellular models of 
atrial, ventricular, or human-induced pluripotent stem cell-derived cardiomyocytes 
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have been developed. An established model for atrial electrophysiology is the 
Courtemanche et al. cell model [91]. With nonlinear-coupled ordinary differential 
equations, ionic currents and the transmembrane voltage can be calculated by 
considering different ion channels, transporters and pumps. The time dependent 
course of the transmembrane voltage V,, is calculated as 


dV,„ = — (Lion + Istim) 
dt Cin 


(3.1) 


where Cm denotes the membrane capacity and Istim a stimulus current applied 
externally. The ionic current across the membrane Tion is calculated as the sum 
of twelve single ionic currents considered in the model. Each of them can be 
described by Ohm’s law and can thus be calculated as the product of the ion 
channel conductivity and an ion specific voltage multiplied by gating variables. 
The latter are introduced to describe the open probability of an ionic channel in 
the cell membrane defining whether a specific ion can move from the extra- to the 
intracellular space or vice versa. 

By solving the ordinary differential equations, the time course of an action 
potential in response to an externally applied stimulus can be calculated. 


3.1.2 Propagation Models 


Propagation models can be employed to calculate the spatio-temporal spread of 
the electrical depolarization wavefront on the cardiac tissue either in form of trans- 
membrane voltages V,,, and extracellular potential fields P, (see section 3.1.2.1 
and section 3.1.2.2) or local activation times (see section 3.1.2.3). The optimal 
choice of which propagation model to apply depends on the problem to be solved. 
In Table 3.1, simplifications and inaccuracies of different methods are listed. Even 
though the bidomain model is biophysically the most accurate description, is it 
computationally notably more expensive to solve compared to simplified model 
solutions. Various assumptions summarized in Table 3.1 are necessary for the 
latter to reduce computational cost and speed up simulations. Independent on 
the choice of the propagation model, the domain has to be discretized in space 
to numerically solve the model equations, for which different schemes, like for 
example finite differences, finite elements or finite volumes [92], are applicable. 
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3.1. Multi-scale Electrophysiological Modeling 


3.1.2.1 Bidomain Model 


The bidomain model presents the biophysically most detailed and accurate de- 
scription of cardiac excitation propagation on tissue level to date [93-95]. By 
solving the system of partial differential equations 3.2 and 3.3, the potentials in 
the coupled intra- and extracellular space ®; and ®, can be calculated: 


-V:((6;+0,)V®,) = V : (o; VV,,) (3.2) 
V . (o;VV,,) + V . (o;V@,) = B (ne + Lion -1.) (3.3) 


In the equation system above, o; and o, denote the conductivity tensors in the 
intra- and extracellular domain, B the surface-to-volume ratio of the cell, Cm the 
membrane capacitance. Jip, and I, describe the sum of all ionic and externally 
applied stimulus current densities across the membrane, respectively. V, is the 
difference in potential between the intra- and extracellular space (equation 3.4), 
and is thus defined as the transmembrane voltage: 


Vin =P; — P. (3.4) 


Solving the partial differential equation system requires a specification of 
boundary and initial conditions [96]. Homogeneous Neumann boundary condi- 
tions can be imposed for the tissue-bath interface in the intra- and extracellular 
domain. No flux is assumed for the normal current at the tissue-bath interface in 
the intracellular domain: 


n-o;-V®;=0 (3.5) 


In the extracellular domain, no flux is also enforced for ®, at the boundary of 
the torso not in contact with myocardial tissue: 


n: o, Vb, = 0 (3.6) 


whereby op denotes the conductivity of the specific tissue type considered 
in a simulation setup of a heterogeneous torso volume conductor. Furthermore, 
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continuity of the potential in the interstitial space (equation 3.7) as well as of 
the normal component of the extracellular current at the myocardial tissue-bath 
interface (equation 3.8) are enforced: 


Pele = $,|, (3.7) 
Nn:0.:V®,=n:0,:V®, (3.8) 


State variables of the cell model described in section 3.1.1 paced to a limit 
cycle at 1 Hz account for the initial conditions of the model. 


3.1.2.2 Monodomain Model 


The monodomain model constitutes a simplification of the bidomain formula- 
tion [94, 95, 97, 98]. Under the assumption of equal anisotropy ratios in the intra- 
and extracellular domain, the bidomain model can be reduced to the following 
non-linear partial differential equation: 


OVin 
V 2 (G, VV,,) — p (ton +Cn i =) (3.9) 


dt 
Om denotes the monodomain conductivity, which can be calculated as half of the 
harmonic mean between intra- and extracellular conductivity: 


Ca (3.10) 


In the monodomain model, only the currents in the intracellular space and through 
gap junctions are considered. Thus, the torso volume conductor does not have an 
impact on the transmembrane voltage, which leads to bath loading effects being 
ignored. However, an approximation of the extracellular potential ®, can still 
be obtained by a temporally infrequent solve of the computationally expensive 
elliptical bidomain equation 3.2 [99]. 

When applying either of the mono- or bidomain model, strict requirements 
on the mesh resolution need to be fulfilled to avoid the effect of a coarse mesh 
slowing down conduction velocity or even breaking down wave propagation [100]. 
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3.1.2.3 Eikonal Model 


By solving the Eikonal equation, the activation time T for each node x in the 
spatial domain of the cardiac model can be computed as 


JVT(x)'-M-VT(x) =1 (3.11) 


where M is the squared conduction velocity (CV) tensor. Defining a set of trigger 
points x € [` where excitation is initiated is required to set up the initial conditions 
to solve equation 3.11: 


T(x) = Tọ for x € T (3.12) 


Methods for solving the Eikonal equation comprise for example the fast march- 
ing [101] or the fast iterative method [102]. 

Unlike in biophysically detailed excitation propagation models (see sec- 
tion 3.1.2.1 and 3.1.2.2), solving the Eikonal equation only yields local activation 
times (LATs) on each node belonging to the cardiac tissue. However, a complete 
representation of the transmembrane voltage distribution would be required to 
further compute cardiac signals, such as for example electrograms [103] or ECGs. 
Thus, the transmembrane voltage V,,(x,t) at each node x and for each timestep t 
can be infered in a postprocessing step based on the calculated LATs as follows: 


Vin(x,t) = U (x,t — T (x)) (3.13) 


whereby U(x,t) is an action potential time course. It can be obtained for example 
based on a mono- or bidomain simulation of a planar wave propagation in a tissue 
strand experiment from which the time course of V,, was extracted at a node 
located in the center of the strand mesh. 


3.1.2.4 Reaction-Eikonal Model 


In the reaction-Eikonal model, activation times are in the first place obtained by 
solving the Eikonal equation and are used subsequently to allow for the appli- 
cation of biophysical models, such as the bi- or monodomain formulation, to 
coarse meshes [104]. An Ifoot current is introduced to mimic the effect of the 
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diffusion term and is applied at the activation time obtained as the solution to the 
Eikonal equation 3.11. In the RE* variant coupled to the monodomain model, the 
respective equation (compare equation 3.9) is modified as follows: 


ƏV, 
vs (OmV Vm) +I foot s = B (tom + Cm: e) (3.14) 
In this way, a node can either be activated by the diffusion term or the {foot 
current. Furthermore, adjacent nodes also interact during the repolarization phase. 


3.1.3 Forward Calculation Methods 


By solving the forward problem of electrocardiography, the dipole sources of Vm 
in the heart obtained from simulations of excitation propagation as explained in 
section 3.1.2 can be mapped onto the body surface [105]. In this way, body surface 
potential maps can be generated or ECGs can be calculated by extracting the 
potentials at common electrode positions on the upper body. Different methods to 
approach this problem are published and are summarized in the following sections. 


3.1.3.1 Finite Element Method 


The term finite element method usually refers to solving the Poisson equation 
from the parabolic bidomain equation 3.2 using a finite element discretization 
scheme [106]. Thus, the entire torso domain must be discretized volumetrically 
with finite elements. The torso together with all organs inside except for the 
heart are then modeled as a passive volume conductor and the respective elements 
assigned conductivities Oe. In this way, equation 3.2 can be solved numerically 
and the extracellular potential field P, on the body surface can be obtained. 


3.1.3.2 Boundary Element Method 


The basic principle underlying the boundary element method (BEM) is that only 
equivalent dipole sources on the cardiac surface are considered when calculating 
body surface potentials and ECGs to reduce computational cost. To transfer a 
reduced set of dipole sources onto the surfaces bounding organs of heterogeneous 
conductivity properties, isotropic myocardial conduction properties in the ex- 
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tracellular space are assumed [106]. Applying Green’s theorem and boundary 
conditions as well as assuming equal anisotropy ratios in the intra- and extra- 
cellular space, equation 3.2 can be reformulated as follows for calculating the 
extracellular potential field ®, at any point Z on the surface S! [107]: 


20, el (Fer) > a 
& r = - P(r P(r’ ——. dš, r S 
e(r) ol +o! e (P) m ok +o% Jsk ( mer f 


(3.15) 


where o“ and ok denote the conductivities inside and outside of surface k. p° 
is the potential that would be generated if the heart was immersed in an infinite, 
homogeneous medium characterized by a conductivity of os. The dipole sources 
can either be expressed as transmembrane voltages on the cardiac surface [108] or 
volumetrically as primary impressed currents J, in the volume conductor Vp: 


l Jp: (Fr) 
o> (7) = Ë 3.16 
he 2.16) 


3.1.3.3 Infinite Volume Conductor Method 


The infinite volume conductor forward calculation method relies on the assumption 
that the heart and thus all cardiac dipole sources are immersed in an unbounded, 
homogeneous medium. Hence, the surface integral of the secondary sources 
introduced by the bounded volume conductor in equation 3.15 are neglected and 
the extracellular potentials ®, are calculated as follows: 


> 


1 J, (#— r!) 
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32 Statistical Shape Modeling 


A statistical shape model (SSM) describes the variability in shape of a geometrical 
object in a cohort of individual instances. Point distribution models constitute the 
most common subclass of SSMs and aim at parameterizing shape deformations 
by describing the spatial movement of a set of unique landmarks annotated on 
the surface of each individual instance [109]. The processing steps preceding the 
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acquisition of the coordinate vectors of these landmarks as an input applicable to 
principal component analysis (PCA) to eventually evaluate the shape statistics are 
shown along with the mathematical notation in Figure 3.2. 


Concatenating sc of all | 
N instances in 
correspondence yields 
r Lo \ 7 s the required input for 
| Initial Model Population | | Spatial Alignment | Correspondence Retrieval principal component 


Acquisition analysis: 
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Set T^ comprising N aligned || Set TŠ comprising N aligned 
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Figure 3.2: Processing steps and mathematical notation for building a statistical shape model 
(SSM). The red text and boxes indicate the changes applied to the shapes T and their vector- 
ized surface point representations s in each step. The acquisition of the initial model popula- 
tion forms the basis for building the SSM and is explained in section 3.2.1. After aligning these 
instances in space (see section 3.2.2) and establishing correspondence (see section 3.2.3), the 
shape vectors can be rearranged in a matrix notation and are applicable as an input for prin- 
cipal component analysis to evaluate the shape statistics. 


First, a set of N individual instances I, the SSM will be built on, needs to 
be obtained. In the field of medical engineering, it can either consist of anatom- 
ical maps of the organ of interest recorded during minimally invasive catheter 
procedures or of tomographic image segmentations (see section 3.2.1). Sub- 
sequently, all individual instances Ly,,n € [1,...N] need to be aligned in space 
yielding a set of aligned geometries T^ (see section 3.2.2). Thereafter, correspon- 
dence among them has to be established to obtain a vectorized representation 
xC for each instance n comprising Mg homogeneously sampled surface points 
SERE pI ties Fennel with x, y and z denoting the Cartesian 
coordinates of the surface nodes in instance n. Correspondence can be retrieved ei- 
ther manually by relying on expert annotations or automatically through a generic 
and iterative deformation process considered as an optimization problem (see 
section 3.2.3). 
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3.2.1 Acquisition of Individual Instances 


A population of individual instances T serves as a basis for statistically evaluating 
possible real-world object deformations among them. If a human organ is the 
object the statistical shape model should be built for, this initial cohort of indi- 
vidual instances can either be obtained by segmenting tomographic images or as 
anatomical maps recorded in the course of catheter interventions. 

(Electro-)Anatomical maps of the cardiac chambers are for example recorded 
during electrophysiological studies usually preceding ablation procedures to iden- 
tify pathological electrical excitation pathways, specify ablation targets and thus, 
guide the procedure. Commercial mapping systems, such as CARTO (Biosense 
Webster, Irvine, CA, USA) or RHYTHMIA HDx (Boston Scientific, Boston, MA, 
USA), are equipped with three orthogonal sensors at a catheter’s tip that is inserted 
by a clinician through a vein into the heart. When moving the catheter along the 
endocardial wall of the cardiac chamber, the position of the catheter can be tracked 
in real-time by evaluating the strength of a magnetic field generated by magnetic 
coils in a locator pad placed beneath the patient. In this way, the catheter position 
traced over time can be used to reconstruct the endocardial surface of the cardiac 
chamber of interest. However, the intention when recording electro-anatomical 
maps is not to obtain a detailed and exact representation of the cardiac anatomy, 
but rather to visualize and analyze spatio-temporal information of the electrical 
depolarization wave. Thus, not all anatomical structures are mapped accurately 
and dense enough, let alone examined during the mapping process at all, and thus 
impede evaluating highly variable and localized shape deformations in a cohort. 

Thus, segmentation of tomographic images, such as magnetic resonance (MR) 
or computed tomography (CT) recordings, are sometimes inevitable to obtain 
anatomically detailed and high resolution geometries of the organ to be analyzed. 
The segmentation process can be time-intensive and cumbersome when manually 
performed by trained personnel from scratch. Algorithms based on edge detection 
or clustering techniques are capable of finding possible contours of the organs in 
single image planes or across different planes automatically. These can be applied 
to obtain a suggested segmentation subject to subsequent manual corrections if 
necessary. 

Independent on the modality for acquiring the initial model population, all 
instances can be exported as triangular surface meshes comprising a point cloud 
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bounding the surface of the organ and a connectivity list containing information 
on the spatial arrangement and linking of these surface vertices. 


3.2.2 Spatial Alignment 


The triangular meshes of the individual instances in the initial patient-specific 
model population need to be spatially aligned to avoid a representation of transla- 
tion and rotation of the object in the shape statistics and ensure that only shape- 
related deformations are captured in the eigenmodes of the SSM. 

The set of spatially aligned geometries T^ can be obtained either manually by 
annotating landmarks on characteristic anatomical structures in each individual 
instance or automatically using algorithms such as the iterative closest point 
algorithm [110] for example. In either case, one geometry in the set of individual 
instances has to be chosen as a reference defining the target orientation and 
position of all other instances in the cohort and thus of the SSM to be constructed. 

Whereas an alignment using a set of landmarks can be performed with a 
standard linear least squares estimation, the iterative closest point algorithm 
iteratively translates and rotates a target instance to minimize the vertex-to-nearest 
neighbor distance to the reference instance. The former option comes along 
with the drawback of manual and expert input being required whereas a robust 
performance of the iterative closest point algorithm hinges on the quality of the 
initial estimate of the rigid transformation between each pair of instances to be 
aligned. 


3.2.3 Correspondence Retrieval 


Having aligned all individual instances, correspondence between them needs 
to be established, i.e., a set of points needs to be found located at the same 
anatomical position in all geometries. Based on these, the point distribution model 
is constructed which will provide statistical information on how each of these 
vertices moves in space in the cohort of individual instances. 

These points can again be annotated manually by experts. However, in contrast 
to the set of landmarks that are needed for the alignment described in section 3.2.2, 
a much larger amount of surface nodes needs to be annotated for defining cor- 
respondence. This is because only the spatial movement of these points are 
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represented in the SSM entailing a low spatial resolution and presumably an inac- 
curate representation of the final model as well as the need to interpolate between 
sparsely sampled surface points to retrospectively increase the mesh resolution of 
any SSM instance. 

Thus, establishing dense point-by-point correspondence in the cohort of 
aligned individual instances T^ is usually preferable and can be performed using 
Gaussian process morphable models (GPMMs) to not only automate the process 
but also to reduce the impact of inter-observer variability on manually selected 
landmarks. An overview outlining the automated correspondence retrieval process 
is depicted in Figure 3.3. Thereby, one geometry in the set of aligned individual 
instances is chosen as a reference template T$ (yellow surface in Figure 3.3) and 
is subjected to a generic deformation defined by GPMMs. The latter are defined 
by Gaussian kernels k which describe the spatial relationship between two surface 
vertices vı and v2 as: 


(3.18) 


k(v1,¥2) = s: exp (e) 


In equation 3.18, s denotes a linear scaling factor determining the height of the 
Gaussian bell and / represents the width of the kernel. These Gaussian kernels 
with a predefined variance modulated by the parameter l are applied to a subset 
of uniformly sampled vertices on the reference shape Tå. Approximating them 
with the leading eigenvectors of their Karhunen-Loève transform yields a compact 
and low-rank description of the morphable reference template (grey surface in 
Figure 3.3). Thereby, the weighting factors s in equation 3.18 introduce the free 
parameters affecting the generic deformation [111]. 

Each remaining individual instance other than the geometry selected as a 
reference template is chosen as a target mesh rå at a time (green surface in 
Figure 3.3) and the weights of the generically deformable reference template Í 
are iteratively optimized to minimize the mean distance between each vertex in 
the deformed reference mesh and its nearest neighbor in the current target mesh. 
The generic deformation manifests only in a universal movement of the surface 
points but neither in a change in the amount of nodes Mp in the deformed mesh 
nor in any rearrangement or reindexing of vertices. Therefore, replacing each 
target mesh with the deformed reference mesh leads to a set of homogeneously 
and densely sampled surface meshes T“. 
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Figure 3.3: Correspondence retrieval process One geometry in the set of aligned instances 
T^ is chosen once as a reference T4 (yellow surface) to generate the morphable model T4 
(grey surface) defined by Gaussian processes. The morphing process can be regarded as an 
optimization during which the free parameters in the morphable model are adjusted so that 
the surface-to-surface distance between the deformed reference shape Ë and the current 
target shape I (green surface) is minimized. Replacing the target shape r4 with the deformed 
reference shape found during the optimization procedure yields a representation of the target 
shape in correspondence with the reference geometry (TE). Repeating this procedure N — 1 
times while choosing another target shape TA, {n € [1,N]| n Z R} in each iteration, results in a 
set of N geometries in correspondence I. 


Introducing the notation of a shape vector by stacking x-, y- and z-coordinates 
of the Mr surface node in one instance on top of each other yields a vectorized 
representation of the Mr surface points’ coordinates for each geometry IT as 
sÇ = Ki Feli ner aie ye m£ m|”. Horizontally concatenating all of these 
column vectors for all instances yields a matrix of shape vectors. Rows in this ma- 
trix can be considered as different variables and columns as different observations 
delivering the required input for applying PCA as described in section 3.3. Thus, 
the final SSM consists of a set of eigenvectors and -values ordered by the variance 


in the data they explain. 
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3.3 Principal Component Analysis 


PCA is a technique to reduce dimensionality of datasets represented by many 
variables or features. The aim is to find the directions of the maximum variance 
in the high-dimensional feature space and project the original data onto this new 
subspace. By restricting the new data representation to a reduced number of 
leading orthogonal basis vectors, the amount of features or variables used to 
represent the single data samples can be decreased while still preserving the 
maximum possible data variation [112]. This new orthogonal coordinate system 


can be obtained by eigendecomposition of the covariance matrix C € R?*? 
computed from the data matrix X € RN“? as 
1 g 2 AT 
C= X;—X)(X;-X 3.19 
aq FX -X)(X,— X) (3.19) 


where X denotes the mean of the datamatrix X. In case dimensionality 
reduction should be performed for N shape vectors in correspondence s¢,n € 
[1,...,N] represented by Mg three-dimensional coordinate pairs as described in 
section 3.2.3, the data matrix is of the following form: 


Xii A2ı AN, 
C 
Yia 921 YN,1 
Xsm= | ĝi Zi -- Wa (3.20) 
Cc C C 
ZIMR *2.Mp `` ZN MR 


where 3 - Mr = D for the calculation of the covariance matrix with equation 3.19. 
In case, PCA should be performed on time series of D 12-lead ECGs, the samples 
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can be arranged in a data matrix as follows: 


I (t),..-,¢p) Ih (t,...,tp) see's V6,(tı,...,tp) 
b(tı,...,tp) Ik(tı,...,tp) EUREN V&(tı,...,tp) 

XrcG = : i ; (3.21) 
In(t,---,tp) IIp(t,...,tp) ois V6p(ti,...,tp) 


whereby the ECG time trace comprising tp sampled time steps are concatenated 
horizontally for each of the 12 leads in the order L IL III, aVR, aVL, aVF, V1- 
V6. Thus, 12. P = N holds for the calculation of the covariance matrix with 
equation 3.19. 

Regardless of the entries in the data matrix X, diagonalizing the covariance 
matrix C leads to a set of eigenvalues and eigenvectors 


C=VLV' (3.22) 
where each column in V represents an eigenvector v;,i € [1,...,N — 1] and each 
entry in the diagonal matrix L denotes an eigenvalue A; in decreasing order of the 
total variance. Thus, the projection of each data sample x;, j € [1,...,D] onto the 


new feature subspace can be expressed as: 


N-1 
xj=X + Yj Ai: vi (3.23) 
i=l 
By reducing the number of eigenmodes N — 1 in equation 3.23, the data sample 
x; can be represented by the respective amount of the newly created, transformed 
variables r, which are also referred to as principal component scores in the 
following. 


3.4 Machine Learning 


A machine learning model aims at analysing data and identifying patterns among 
them. During the development process, a statistical model is first trained on an 
explicit training set and subsequently evaluated on an independent test set. This 
serves the purpose of ensuring that the model learns to derive patterns from the 
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training samples transferable to previously unseen data instead of overfitting to 
the provided training data. Leaving only one fixed subset of data out for the 
testing procedure is commonly referred to as hold out-validation. Alternatively, 
k-fold cross-validation can be performed describing the process of subdividing 
the total data available into k subsets, saving one subset at a time for testing while 
combining the remaining ones for training. The network is repeatedly trained 
k times from scratch while rotating the specific fold left out for testing, so that 
after all, each subset was used for testing once. Most times, cross-validation is 
preferred over hold out-validation as the source of evidence that the model is 
capable to generalize well to unseen data is increased by a factor of k. [113] 

In supervised learning, the network is fed with a set of input data and associated 
ground truth annotations. Between these two entities, the network has to make 
connections and learn relationships to perform the annotation task for the unseen 
data during testing itself. In case the ground truth annotations are defined as 
discrete labels and can be summarized in a dictionary, the network is meant to 
perform a classification task [114]. If the annotations are continuous numbers 
instead, the network is denoted as a regression model. In unsupervised learning, no 
ground truth annotations are provided. Instead, the model should find similarities 
among the training data and derive rules to cluster them into appropriate groups 
itself. 

The metrics to assess the performance of the network depend on the specific 
task it should perform. For classification problems, confusion matrices can be 
specified visualizing the relation between correctly labeled and misclassified data 
samples in the test set (see Figure 3.4). 

From the confusion matrix, accuracy (ACC), sensitivity (or true positive rate, 
TPR) and specificity (or true negative rate, TNR) can be calculated class-wise as 


follows: 
TP+T 
ACC = (3.24) 
TP+TN+FP+FN 
TP 
TPR = ———ə (3.25) 
TP+FN 
T 
TNR= ze, (3.26) 
TN+FP 


with TP, TN, FP, FN being the rate of samples correctly classified as positive, 
correctly classified as negative, wrongly classified as positive and wrongly classi- 
fied as negative, respectively, when considering the class of interest as positive. 
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Figure 3.4: Schematic representation of a confusion matrix to visualize (mis-)classification 
results. 


Especially for binary classification problems, oftentimes a receiver operating char- 
acteristics (ROC) curve is constructed by visualizing sensitivity over 1-specificity 
when varying the discrimination threshold of the model. The area under the ROC 
curve (AUC) thus provides another metric to assess the network performance. 
Moreover, an optimal discrimination threshold as the best possible trade-off be- 
tween sensitivity and specificity performance can be chosen by selecting the point 
on the ROC curve closest to the top left corner. For regression models, the mean 
squared error (mse) or root mean squared error (rmse) can be calculated between 
the predicted ($) output and the target (y) annotation for all test samples N as: 
la së 
mse = = 2> ($—y) (3.27) 


(3.28) 


The set of input data provided to the network can be of different shapes 
and sizes. Deep learning models are capable of directly extracting high-level 
features [115] from either images in case of convolutional neural networks or 
multi-dimensional time series in case of long short-term memory networks (see 
section 3.4.2) themselves. Shallow networks (see section 3.4.1) instead require a 
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distinct feature extraction step from the raw data prior to providing them as input 
to the network tailored at performing the intended regression or classification task. 

Numerous types of machine learning models exist, two of which employed 
for studies presented in this thesis are described in section 3.4.1 and 3.4.2. 


3.4.1 Neural Networks 


The topology of a neural network typically involves an input layer, one or more 
hidden layers and an output layer. Each layer comprises multiple neurons that 
are linked to the neurons in subsequent layers [116, 117] (see Figure 3.5). In a 
feedforward neural network, these connections exclusively point in the direction 
of the output layer and contain no feedback loops as is the case in recurrent neural 
networks instead. 

Feature values extracted from the available dataset are propagated from the 
input to the output layer. In each neuron j, the input values h; are multiplied 
with the weights w;; and their sum is then passed on to an activation function to 
account for non-linearities between data samples [118]. The resulting value is 
then propagated to a neuron in the successive layer that processes it accordingly. 
Once the entity in the output layer is reached, an error function determines the 
deviation between the ground truth annotation and the prediction of the network. 
To minimize this error, the weights in each neuron are adjusted iteratively through 
backpropagation. Thereby, the gradients of the error function with respect to 
the weights of the neurons are calculated in each layer and combined by the 
chain rule for partial derivatives. Approaches like gradient descent or the ADAM 
procedure allow for updating the weights until the error function converges to a 
minimum. Besides the weighting factors that are implicitly optimized during the 
learning process, a neural network typically comprises numerous other tunable 
hyperparameters. Among them are the number of layers or neurons themselves 
or the learning rate, which governs the pace for optimizing the weights during 
backpropagation. 


3.4.2 Long Short-term Memory Networks 


Whereas shallow feedforward neural networks rely on extracted features ob- 
tained from an indispensable pre-processing step, deep neural networks, are 
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Figure 3.5: Schematic representation of the structure of a feedforward neural network (top 
panel) and a neuron (bottom panel). An input, one or multiple hidden and an output layer 
comprise the basic framework of the model. Each layer consists of multiple neurons, multi- 
plying the input values h with weighting factors w;;, combining them with the input function 
(£) and passing them through an activation function (o) to the next layer. 


capable of omitting this step and process the data in a format of images or time 
series [115, 119, 120]. The latter constitute the typical input format for long 
short-term memory (LSTM) networks, which evolved from recurrent neural net- 
works explained in section 3.4.1. By building on the concept of recurrent neural 
networks, the LSTM network has memory capacity, i.e., the output from pre- 
vious time steps is taken into consideration for the computation of the current 
output. This property makes it particularly suitable for time series input data 
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where sequential data samples are naturally related to one another. In contrast to 
vanilla recurrent neural networks, the design of an LSTM avoids the problem of 
long-term gradients vanishing to zero during backpropagation. Thus, the specific 
LSTM structure allows for considering both long and short term memory during 
the learning process. Equivalent to neurons in feedforward neural networks, the 
essential unit in an LSTM is called LSTM cell. It consists of an input, an output 
and a forget gate. The input gate decides whether a specific input value is impor- 
tant for learning the intended task by allocating weights and thus decides if the 
cell’s memory is affected by that value. In contrast, the forget gate controls which 
details that are to be discarded for the ongoing learning process. The output gate 
finally combines the information from the input gate and the memory of the cell 
to determine the output value A, that is not only propagated to the LSTM cell in 
the next layer but instead also to the cell of the subsequent input data sample. A 
fully connected layer can be employed for pooling the output values of the LSTM 
cells in the last layer and calculating the final classification result. 

The training of an LSTM network can be performed by optimizing weights 
based on error backpropagation as explained in 3.4.1. 
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VALIDATION OF 
SIMPLIFIED MODELING 
APPROACHES 


Chapter 4 


Comparison of Propagation 
Models and Forward Calculation 
Methods 


In this chapter, simulated atrial signals computed with different propagation 
models and forward calculation methods are compared with respect to action po- 
tential duration (APD), local activation time (LAT) and electrocardiogram (ECG) 
biomarkers. This serves the purpose to assess whether computational cost for 
large-scale simulations can be reduced by resorting to simplified electrophysiolog- 
ical models without compromising the accuracy of resulting signals. 


The content of this chapter is taken and adapted from a paper that has been 
published open access under licence CC-BY 4.0 in IEEE Transactions on Biomed- 
ical Engineering [121]. Most passages have been quoted verbatim from the 
publication. 


4.1 Introduction 


In computational cardiac modeling, the bidomain model (see section 3.1.2.1) 
is the biophysically most detailed formulation to compute the spread of the de- 
and repolarization wavefront and the electrical source distribution throughout the 
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cardiac tissue to date. Furthermore, the finite element method (see section 3.1.3.1) 
is considered the gold standard for computing the body surface potentials from 
a given distribution of the electrical sources in the heart to extract ECGs at 
standardized electrode positions. However, both methods are computationally 
expensive and are thus suboptimal for generating large in silico datasets of cardiac 
signals for machine learning applications [122, 123], or for efficiently simulating 
excitation propagation in cardiac digital twins for certain clinical applications 
with real time requirements, such as guiding ablation therapy [48] or parameter 
inference [124]. Hence, simplified models with fast solution times are needed 
to speed up the generation of in silico datasets of cardiac signals, such as LATS, 
electrograms or ECGs by several orders of magnitude [103, 104, 125]. Yet, 
the signals obtained with these simplified methods need to be physiologically 
accurate and have to highly resemble the results obtained with the gold standard 
methods for a meaningful application of the former. It is therefore essential 
to quantify the inaccuracies arising in simulated atrial signals when resorting to 
simplified computational methods to demonstrate their possibilities and limitations 
for particular use cases. While comparisons of this type have already been 
performed for the ventricles [99, 104, 126] and partly also for four chamber heart 
models [127], a study focusing on atrial electrophysiology has not yet been carried 
out to a sufficient extent. However, this is substantial since the atria stand out 
by a highly complex myocardial fiber structure, locally heterogeneous properties 
regarding ion channel and tissue conductivities and higher anisotropy ratios as 
compared to the ventricles. 

The monodomain (see section 3.1.2.2), the reaction-Eikonal (RE) (see sec- 
tion 3.1.2.4), and the Eikonal model (see section 3.1.2.3) solved by the fast 
iterative method constitute the simplified propagation models investigated in the 
study presented in this chapter. Forward calculation techniques applied in this 
study comprise the boundary element (see section 3.1.3.2) and the infinite vol- 
ume conductor (see section 3.1.3.3) methods. Simulations were carried out in 
sinus rhythm with and without the inclusion of fibrotic tissue modeled as passive 
conduction barriers [128], slow conducting tissue patches and rescaled ion chan- 
nel conductances representing cytokine effects [82, 129]. Errors were assessed 
between simplified propagation models and forward calculation methods to the 
gold standard bidomain (see section 3.1.2.1) and finite element formulations (see 
section 3.1.3.1) with metrics extracted from the simulation results on cellular, 
tissue and organ scale comprising APDs, LATs, and ECGs, respectively. 
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4.2.1 Atrial Model and Simulation Setups 


An anatomically detailed model of the torso was obtained by multi-label magnetic 
resonance image segmentation as described by Krueger et al. [130]. The contours 
of atria, ventricles, lungs, liver and torso were exported as triangular surface 
meshes. These were smoothed and resampled with an average edge length of 
0.5 mm, 5 mm, 5mm, 7 mm and 15 mm, respectively, using Meshmixer (Autodesk, 
San Rafael, CA, USA) and InstantMeshes [131] whereby details were corrected 
manually in Blender (Blender Foundation, Amsterdam, The Netherlands) to avoid 
intersecting segments and ensure a sufficient mesh quality and topology. The 
segmented atrial endocardial surfaces were fed into the pipeline described by 
Azzolin et al. [52, 129, 132] to obtain a volumetric tetrahedral bi-atrial geometry 
with a homogeneous wall thickness of 3 mm and an average edge length of 523 um 
augmented with inter-atrial connections, labels for anatomical structures and 
myocardial fiber orientation. In contrast to fully personalized approaches where 
fiber orientation can be defined based on information extracted from diffusion 
tensor imaging data [133, 134], myocardial fiber architecture was defined in a 
rule-based way [52] building on the solution of Laplace’s equation [135, 136]. 
Meshtool [137] was used to generate a tetrahedral model of the full torso while 
preserving the surfaces of the considered organs. Tags for the atrial and ventricular 
blood pools were allocated to all elements in the volumetric torso model located 
inside the surfaces bounded by the atrial and ventricular endocardial walls with 
closed valve and vein orifices. A detailed view of the torso and the atrial model is 
depicted in Figure 4.1. 

Isotropic conductivity of 0.0389 S/m, 0.028 S/m, 0.06 S/m, 0.7 S/m and 0.22 S/m 
was assigned to lungs, liver, ventricles, atrial and ventricular blood pools and the 
remaining torso tissue, respectively, as reported in previous work [34, 138, 139]. 

In order to conduct comparable experiments with the mono- or bidomain 
model that require conductivities, and the Eikonal-based models that resort instead 
to conduction velocitys (CVs), it is crucial to consistently associate conductiv- 
ities and CVs for all heterogeneous tissue regions in the atria (see Figure 4.2). 
Anisotropic and locally heterogeneous conductivities were assigned to five differ- 
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Figure 4.1: Torso model with the segmented organs and electrode positions. Transparent 
and clipped views are shown from the anterior and posterior side in the top and bottom row, 
respectively. The bottom panel shows the anatomically detailed atrial model that was aug- 
mented with fiber orientation and labels for anatomical structures. Heterogeneous conduc- 
tivity and ionic properties were assigned to spatially distinct regions in the mesh. Resulting 
APs featuring ionic heterogeneity are depicted on the right side in red together with the base- 
line Courtemanche et al. cellular model solution in blue. 


Anterior view 


Posterior view 
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ent regions in the atria comprising regular bulk tissue, crista terminalis, pectinate 
muscles, inferior isthmus, and inter-atrial connections. CVs corresponding to 
the monodomain conductivities reported in [140] for 0.33 mm resolution voxel 
models were therefore first calculated as described by Krueger et al. [141]. Using 
tuneCV [98, 100], intra- (o;) and extracellular (o¿) conductivities as well as the 
monodomain conductivities (Om) were iteratively optimized for the tetrahedral 
mesh setup described above while keeping the 0;/0, ratio fixed. For this purpose, 
five strand geometries with a length of 10 cm were generated each characterized by 
a resolution corresponding to the average edge length of one of the heterogeneous 
conductivity regions in the atria (see Figure 4.1, bottom panel). Intra- and extra- 
cellular conductivities in longitudinal and transversal fiber direction as reported 
by Clerc et al. [142] as well as by Roberts et al. [143] were initially assigned to 
the elements in the slab meshes. In an iterative optimization procedure, the con- 
ductivities were adjusted until the CVs converged to the target value derived from 
Loewe et al. [140]. In this way, the originally reported intra- and extracellular 
conductivity values were scaled while the ratios between them were kept constant 
along the eigenaxes [100]. In the following, the tuned conductivities obtained by 
iteratively scaling the values reported by Clerc [142] and Roberts et al. [143] in the 
tuneCV setup are refered to as "Clerc" and "Roberts" conductivities, respectively. 
The resulting heterogeneous and anisotropic conductivity setup for each atrial 
region is summarized in Table 4.1. For the monodomain simulations, two different 
cases were considered which are refered to as "monodomain with and without 
explicit conductivity tuning". For the first one, the tuneCV optimization was 
repeated using the monodomain propagation model and obtained the monodomain 
conductivities listed in Table 4.1. In the second case, the monodomain conduc- 
tivities were directly computed from the tuned intra- and extracellular bidomain 
conductivities as half of their harmonic mean. The final conduction velocities and 
conductivities are summarized in Table 4.1 and Table 4.2. 

The Courtemanche et al. cell model [91] described in section 3.1.1 was used 
for the simulations in this study. To reflect regionally heterogeneous electro- 
physiology, maximum ion channel conductances were rescaled compared to the 
baseline model as reported in previous work [140, 144] and are summarized in 
Table 4.2. The final CVs values in longitudinal and transversal fiber direction 
as used for the Eikonal and RE simulations were subsequently calculated with 
tuneCV [98] based on the tissue and ion channel conductivity settings in each 
atrial region. 
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Figure 4.2: Workflow for tuning conductivities and conduction velocities with tuneCV. The 
monodomain conductivities reported in Loewe et al. (2015) were transfered in conduction ve- 
locities using the formulas reported by Krueger (2013). On a slab mesh with a resolution corre- 
sponding to the average edge length of the regions in the bi-atrial geometry, the initial intra- 
and extracellular conductivities as reported by Clerc et al. as well as by Roberts et al. were 
assigned. In an iterative optimization process, these conductivities were linearly scaled until 
the target conduction velocity was reached. In an openCARP experiment, action potential 
templates were computed using the ionic remodeling settings reported by Loewe et al. (2015) 


and the optimized conductivity settings on the same slab mesh. 
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Table 4.2: Scaling factors for ion channel conductances with respect to the baseline Courte- 
manche et al. cell model and resulting CVs in different atrial regions. 


Atrial region lonic heterogeneity CV (m/s) 
Sto Scan gna BK1 | CV cv! 

Regular atrial myocardium 0.5905 1.2465 
Atrial appendages 0.68 1.06 0.5950 1.2467 
Atrio-ventricular rings 153 0.67 0.5965 1.2456 
Crista terminalis 1.67 0.5911 1.6839 
Pectinate muscles 0.4612 1.7435 
Bachmann’s bundle 0.6450 2.1511 
Inferior isthmus 0.5402 0.5402 
Fibrosis (non conductive) 0 0 
Fibrosis (slow conducting) 01181 0.6232 
Fibrosis (ionic remodeling) 0.5 06 0.5 | 0.4812 1.0063 


4.2.2 Simulation Scenarios 


Simulations were carried out on the bi-atrial volumetric model described in sec- 
tion 4.2.1 in sinus rhythm with and without the inclusion of fibrotic tissue patches. 
For the former case, several elliptically shaped patches with their principal axis 
aligned to the macroscopic atrial fiber orientation were manually defined pre- 
dominantly on the posterior wall of the left atrium and the left pulmonary vein 
antrum as reported by Highuchi er al. [145]. These regions extended transmurally 
and are shown in Figure 4.3. To not only account for the patchiness of atrial 
fibrosis but also for its diffuse deposition, 80 % of the cells within the elliptical 
candidate regions were defined as fibrotic. In this way, the volume fraction of left 
atrial fibrosis quantified to 22 % of the total left atrial tissue volume. Remodeled 
conduction properties were assigned to fibrotic regions in three different ways as 
shown in Figure 4.3 and described in the following. 
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Figure 4.3: Overview of the different propagation models, forward calculation methods, sim- 
ulation scenarios and evaluation metrics used in this work. 


In the first case, fibrotic elements were removed from the atrial mesh and 
instead assigned to the extracellular domain following the concept of percola- 
tion [128, 146]. In this way, passive conduction barriers were introduced that do 
not exhibit any transmembrane voltage and thus do not contribute to the electrical 
source distribution on the myocardial tissue surface. In the second case, fibrotic 
regions were characterized as slow conducting patches with CVs reduced by 80 % 
in transversal and 50 % in longitudinal fiber direction compared to the healthy 
baseline case [48, 141]. Conductivities in these regions were subsequently ob- 
tained as described in section 4.2.1. In this way, anisotropy ratios were inherently 
increased by a factor of 2.5 in fibrotic areas promoting wave propagation along 
myocardial fiber orientation and thus forming the basis for functional reentry 
circuits [81]. In the third case, ionic properties of the fibrotic cells were remodeled 
by rescaling the conductances of the sodium (gya), the L-type calcium (gcaz) 
and the inward rectifier potassium current (gx1) by a factor of 0.6, 0.5 and 0.5, 
respectively, compared to the baseline conductances of the Courtemanche et al. 
cell model to account for cytokine induced effects mediated by transforming 
growth factor (TGF)-ß 1 [82, 147, 148]. 

Sinus rhythm simulations were initiated by triggering the activation propa- 
gation at a sinoatrial node exit site located at the junction of crista terminalis 
and the superior vena cava [149, 150]. The transmembrane voltage distribution 
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for the LATs computed with the Eikonal model was obtained as described in 
equation 3.13, whereby the respective ionic model parameters in each region as 
listed in Table 4.2 were taken into account for calculating the action potential (AP) 
templates. 

The Cardiac Arrhythmia Research Package (CARP) [151] and openCARP [98] 
were used for computing the spread of the depolarization wave with different prop- 
agation models as well as ECGs with the finite element and the infinite volume 
conductor method. The algorithms described by Stenroos et al. [107] were used 
for calculating ECGs with the boundary element method. As recommended by 
Schuler et al. [152], the surface mesh bounding the atria was downsampled to 
a resolution of 2.5 mm for computing the transfer matrix. Furthermore, Lapla- 
cian smoothing was applied to the transmembrane voltage sources to ensure a 
continuous wave propagation on the coarse mesh. 


4.2.3 Evaluation Metrics 


From the source distribution obtained from simulations using different propagation 
models, APDs at 90 % repolarization (APDog) were calculated for each node in 
the mesh. Also at each vertex in the geometry, LATs were extracted defined as the 
timestep with the steepest AP upstroke. For both, APDs and LATs, the accuracy 
of each propagation model simulation was quantified as the absolute difference to 
the respective value for each metric obtained from the bidomain simulation with 
the Clerc conductivities. 

To assess fidelity of simplified forward calculation methods along with dif- 
ferent propagation models, the Pearson correlation coefficient of the respective 
ECG results with the ECGs obtained by solving the forward problem with the 
finite element method based on the bidomain source distribution computed with 
the Clerc conductivities was evaluated. 
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Figure 4.4: Local activation time (LAT) results in sinus rhythm for healthy (non-fibrotic) tissue 
for different propagation models. The top panel shows the distribution of the signed LAT 
differences taking the bidomain simulations executed with the Clerc conductivity ratios as 
a reference. From left to right, the violin plots show the results for the bidomain (Roberts 
conductivities), the monodomain (with and without explicit conductivity tuning), the RE* and 
the Eikonal simulations. The bottom panel shows the signed LAT differences mapped on the 
atrial geometry for each propagation model in the above order. Mean and standard deviation 
of the absolute LAT differences are shown in the bottom row for each case. 


4.3 Results 


4.3.1 Propagation Models 


The effect of different propagation models on the activation sequence (LAT) is 
visualized in Figure 4.4. The maximum activation time in the healthy reference 
scenario solved with the bidomain model was 102 ms. In the top panel, the 
distributions of the signed differences between the examined propagation models’ 
LATs and the bidomain results obtained with Clerc conductivity ratios evaluated 
at all mesh nodes are visualized as violin plots. In the bottom panel, the difference 
to the bidomain results are mapped onto the atrial geometry. 

The mismatch in LATs was most pronounced for the bidomain scenario 
with Roberts conductivities and much smaller for the simplified propagation 
models. For the Roberts conductivity ratios, the LATs were systematically smaller 
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than the ones resulting from the reference bidomain simulation with the Clerc 
conductivity settings. Furthermore, the error increased with the spread of the 
depolarization wave front leading to small deviations close to the sinus node 
exit site, but errors of up to -14 ms at the latest activated areas at the posterior 
wall of the left atrium and the coronary sinus in the right atrium. The mean and 
standard deviation of the absolute errors between the bidomain and monodomain 
LATs with and without explicit conductivitiy tuning were 0.93+0.61 ms and 
1.02+0.64 ms. With the temporal resolution of the sampled simulated myocyte 
APs being 1 ms and the LATs being calculated as the point in time marking the 
steepest AP upstroke, in particular the LAT results for the monodomain simulation 
with additional conductivity tuning were below the accuracy with which the LATs 
were determined. RE* and Eikonal LAT differences quantified to 1.37+1.16 ms 
and 1.43+1.17 ms, respectively. The signed LAT error to the bidomain results was 
distributed similarly across the atrial tissue among these two propagation models 
(see Figure 4.4 bottom panel). 

The LAT results in the simulation scenarios involving fibrosis remodeling 
were only slightly different (see Figure 4.5) compared to the sinus rhythm results 
depicted in Figure 4.4. The largest differences occurred for the Eikonal prop- 
agation model in the simulation scenario where fibrosis was modeled as slow 
conducting tissue. There, the absolute error to the bidomain results quantified to 
1.47+1.26 ms compared to 1.43+1.17 ms in sinus rhythm without the inclusion 
of fibrosis. 


APDoo results are visualized for the simulation scenario with fibrosis modeled 
as ionic conductance rescaling in Figure 4.6. For the monodomain simulations, the 
mean and standard deviation of the absolute APDog discrepancies to the bidomain 
results obtained with Clerc conductivity ratios were below the temporal resolution 
of the AP time course of 1 ms. Absolute errors to the bidomain simulation with 
Roberts conductivity ratios and the RE* results quantified to 2.92+3.07 ms and 
1.35+1.69 ms, respectively. In both cases, the highest errors occurred in regions 
around the fibrotic tissue patches. APDoo results for the Eikonal simulation 
were characterized by an absolute error to the bidomain simulation results of 
25.1+20.72 ms. Furthermore, the AP signal trace obtained from a tissue strand 
simulation and used as a template to infer the transmembrane voltage distribution 
for the Eikonal LATs is visually clearly distinguishable from the bidomain AP 
especially in fibrotic regions (see Figure 4.6 bottom panel). 
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Simulation scenario: Fibrosis (non-conductive) 
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Figure 4.5: LAT differences between different propagation models and the bidomain simula- 
tion results obtained with the Clerc conductivities. The top row shows the mean + standard 
deviation of the absolute differences to the bidomain LATs. 


The APD results in the simulation scenarios involving other fibrosis remod- 
eling methodologies as well as in the healthy control case are visualized in 
Figure 4.7. The largest differences occurred for the Eikonal propagation model in 
the simulation scenario where fibrosis was modeled as slow conducting tissue. 

ECGs obtained from the transmembrane voltage distributions in the simulation 
scenario with fibrosis modeled as ionic rescaling (as depicted for APDoo in 
Figure 4.6) and using the boundary element forward calculation method are 
visualized in Figure 4.8. The 12-lead ECG is displayed for a duration of 450 ms 
whereby the signal sections in the interval [0 ms, 150 ms] and [150 ms, 450 ms] 
represent the P wave (panel (a) in Figure 4.8) and the atrial repolarization (panel 
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Simulation scenario: Fibrosis (ionic remodeling) 
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Figure 4.6: APDog results in sinus rhythm with fibrotic substrate replacing 22% of the left 
atrial myocardial tissue modeled as rescaled ionic conductances. The violin plots in the top 
panel represent the distribution of APDoo discrepancies to the bidomain results for all inves- 
tigated propagation models. In the bottom panel, the signed APDog differences are mapped 
onto the atrial geometry. Fibrotic regions are encircled with black dashed lines. The APs are 
shown for one node within the fibrotic area on the posterior left atrial wall. Bidomain APs are 
visualized in light blue, the other signal trace was obtained with the respective propagation 
model. The numbers in the bottom line show the mean and standard deviation of absolute 
APDop differences with respect to the bidomain simulation results. 


(b) in Figure 4.8), respectively. The latter is typically not visible in the clinical 
ECG of a full heartbeat since the repolarization phase of the atria temporally 
coincides with the ventricular activation and the respective signal parts are thus 
buried within the QRS complex. 

The observed discrepancies in the AP signal course between the bidomain 
and Eikonal simulation (see Figure 4.6) also reflect in the ECG. As can be seen 
in Figure 4.8, the repolarization signal obtained with the Eikonal and bidomain 
propagation model differ markedly. In lead aVL, the polarity of the repolarization 
wave was even inverted. Apart from the atrial repolarization in case the ECG 
signal was obtained with the Eikonal model and precomputed AP templates, 
the choice of the propagation model did not noticeably influence the ECG as 
the remaining signals in Figure 4.8 show only minor differences to one another. 
Furthermore, the correlation coefficients between the bidomain ECG obtained 
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Figure 4.7: APDoo differences between the investigated propagation models and the bido- 
main simulation results obtained with the Clerc conductivities. The top row shows the mean 
+ standard deviation of the absolute differences to the bidomain LATs. 


with the Clerc conductivity ratios and the other examined propagation models 
are summarized in Table 4.3 for the intervals [0 ms, 150 ms] (P wave), [150 ms, 
450 ms] (repolarization) and [0 ms, 450 ms] (entire signal). The lowest correlation 
coefficient for the P wave occurred for the bidomain simulation with Roberts 
conductivity ratios. For all simplified propagation models, the P wave correlation 
coefficients were above 0.92. Except for the Eikonal model, the correlation 
coefficient of the ECG signal sections representing the repolarization phase wave 
was above 0.99. 
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(c) Peak-to-peak amplitude features 


Figure 4.8: ECGs calculated with the same forward calculation method (BEM) but different 
propagation models (color coded). Transmembrane voltages resulted from the simulation 
scenario with fibrosis modeled as ionic conductivity rescaling. In panel (a) and (b) the P wave 
and P wave followed by a Ta wave are shown, respectively. In panel (c), lead specific peak-to- 
peak amplitude features extracted from the P wave signals are visualized. 


ECGs only marginally differed between the investigated fibrosis remodeling 
scenarios as shown in Figure 4.9. Also the comparison of ECG metrics in the 
other simulation scenarios led to similar results compared to those visualized and 
described for the fibrosis case with ionic conductance rescaling above. Resulting 
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Table 4.3: Correlation coefficients between the ECGs obtained with the bidomain and sim- 
plified propagation models when solving the forward problem with BEM in the simulation 
scenario of fibrosis modeled as ionic conductance rescaling. Columns represent depolariza- 
tion (P wave), repolarization (Ta wave) and the entire signal. 


[Oms, 150ms] | [150ms,450ms] | [Oms, 450 ms] 
Bidomain, Clerc 1 1 1 
Bidomain, Roberts 0.8590 0.9904 0.9028 
Monodomain 0.9942 0.9998 0.9961 
Monodomain, no tuning | 0.9931 0.9998 0.9954 
Reaction-Eikonal 0.9211 0.9953 0.9428 
Eikonal 0.9203 0.6233 0.8791 


P waves, Ta waves and feature distributions in the remaining simulation scenarios 
are shown for all propagation drivers and forward calculation methods in the 
appendix (section A). 


4.3.2 Forward Calculation Methods 


ECGs calculated with different forward calculation methods based on the same 
source distribution stemming from the bidomain simulation with Clerc conduc- 
tivity ratios are depicted in Figure 4.10 for the simulation scenario with fibrosis 
modeled as slow conducting tissue. 

The correlation coefficients covering the ECG signal parts of the P wave 
between the gold standard finite element method (FEM) approach and each of the 
boundary element method (BEM) and infinite volume conductor (IVC) method 
quantified to 0.98 and 0.83 for fibrosis modeled as slow conducting tissue (see 
Table 4.4). 

Especially the IVC method yielded too high ECGs amplitudes in the precordial 
leads and inaccurately captured atrial repolarization in the inferior leads II, III and 
aVF. 
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(b) P wave and Ta wave 


Figure 4.9: ECGs resulting from a full bidomain simulation for different fibrosis modeling 
approaches. Top panel: P wave, bottom panel: P wave and Ta wave. The blue, red, yellow 
and purple curve show the 12-lead surface ECG for the healthy case (NSR) and the fibrotically 
infiltrated atrial geometry with fibrosis modeled as slow conducting tissue (FIBSLOW), ionic 
conductance rescaling (FIBREMOD) and the percolation approach (FIBEXTRA), respectively. 


4.4 Discussion 


4.4.1 Main Findings 


In this study, atrial APDo9, LATs and ECGs computed with the bidomain, mon- 
odomain, RE* and the Eikonal propagation models as well as with the finite 


element, the boundary element and the infinite volume conductor forward calcula- 
tion methods were compared to one another. 


60 


4.4. Discussion 


oben 


voltage in mV 
ooo 
er 


> 

E2 

31 

o fe 

gor 

Sı 150 1 10 1 150 1 150 

time in ms time in ms time in ms time in ms time in ms time in ms 
(a) P wave 

= I I aVR aVL aVF 
806 l 04) 0 = be 04 
„04 ] 05 0.2 j; os i 01 02 
0.2 |; É 0% = 
s of —. — 0 0 4 oli 0 i 
S 1 150 400 1 150 400 1 150 400 1 150 400 1 150 400 1 150 400 


= N 


voltage in mV 
° 
° 
SD 
° 
ou. 
oo 
CONS 
oo 
oN + 
oo 
on + 
4 
= = 
Eh 
m= 
age 


150 400 1 150 400 1 150 400 1 150 400 1 150 400 1 150 400 


time in ms time in ms time in ms time in ms time in ms time in ms 
(b) P wave and Ta wave 
y 23 e E FEM 
PEN A BEM 
6 Š a « e vc 
eg 
4581 ae “Im ° 
ae nae |" "Ay uae aye | Se „A wA nat wae 
0 
I H Il aVR aVL aVF V1 v2 v3 v4 v5 V6 


(c) Peak-to-peak amplitude features 


Figure 4.10: ECGs calculated with the same propagation driver (bidomain) but different for- 
ward calculation methods for the simulation scenario with fibrosis modeled as slow con- 
ducting tissue. ECGs calculated with FEM, BEM and IVC results are represented by the solid, 
dashed and dottes lines, respectively. Peak-to-peak amplitude features extracted from the 


ECGs calculated with FEM, BEM and IVC are represented with triangle, square and circle 
markers, respectively. 


The largest deviations in LATs were observed between the bidomain simu- 
lations with Clerc and Roberts conductivity ratios. As the absolute LAT errors 
increase with the propagating wavefront, discrepancies in LATs can be traced 
back to more pronounced bath loading effects occurring with the Roberts con- 
ductivity settings (compare Figure 4.11, left panel). With a higher ratio between 
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Table 4.4: Correlation coefficients between the ECGs obtained with the FEM and simplified 
forward calculation methods when calculating the spread of the excitation wavefront with 
the bidomain model in the simulation scenario of fibrosis modeled as slow conducting tissue. 
Columns represent depolarization (P wave), repolarization (Ta wave) and the entire signal. 


[O ms, 150 ms] | [150 ms, 450 ms] | [0 ms, 450 ms] 
Finite Element | 1 1 1 
Method 
Boundary Element | 0.8507 0.9818 0.8947 
Method 
Infinite Volume Con- | 0.8299 0.8107 0.8548 
ductor Method 


extracellular bulk and isotropic bath conductivities, the depolarization wave prop- 
agates faster in close vicinity to the interface between blood pool and endocardial 
wall leading to earlier LATs throughout the cardiac tissue. Due to the thin atrial 
wall, the bathloading effect is visible transmurally and thus results in globally 
faster conduction velocities in the bidomain simulations with the Roberts con- 
ductivity setup. However, in this work, conductivities were tuned as described 
in section 4.2.1 without a bath attached to one face of the strand meshes. Incor- 
porating the bath already in the tuning process would have led to more similar 
results between the bidomain simulation results obtained with tuned Clerc and 
Roberts conductivities. This systematic underestimation of LATs also reflects in 
the ECG. The bidomain simulation with the Roberts conductivity settings yielded 
the smallest P wave correlation to the bidomain ECGs with the Clerc conductivity 
ratios and markedly shorter P wave duration. Also Sebastian et al. [153] found 
that the choice of conductivity ratios in the intra- and extracellular domain as 
well as in longitudinal and transversal fiber direction had a marked effect on CV 
and LATs. Intra- and extracellular conductivity values were derived by Clerc and 
Roberts et al. in animal experiments on specimen from excised trabecular cardiac 
bundles. Measuring intra- and extracellular current flow using microelectrodes 
allowed for a computation of the resistance and in turn the conductivity in longi- 
tudinal and transversal fiber direction in both, the extra- and intracellular space. 
Considering the complex and cumbersome in and ex vivo experiments to derive 
these parameters, fixed ratios between o; and o, along and perpendicular to the 


62 


4.4. Discussion 


Bidomain, Roberts Eikonal 


Bathloading 


colliding waves 


Wavefront (Bidomain, 


É 
Ze 
AZ 
E O 
= = 
as 
ZE 

_ 
<3 
4a 
g 


-30ms 


Collision of convex wavefronts Convex wavefront 
< 


LATs (Bidomain, Clerc) in ms q ‘ N‘ 
| L | | 


23ms 26ms 


Figure 4.11: Visualization of LAT discrepancies between the bidomain results and other prop- 
agation models. The left panel shows the wavefront position obtained with the bidomain sim- 
ulation ran with Roberts conductivities at t=22 ms. The more pronounced bathloading effects 
for the Roberts conductivity ratios becomes visible by color coding the activation wave front 
with activation times obtained with the bidomain simulation and Clerc conductivity settings. 
In this case, nodes along the wavefront at the endocardium in close proximity to the interface 
between blood pool and myocardial tissue were excited at a later time than the ones at the 
epicardium. The right panel shows the signed LAT differences between the Eikonal and the 
bidomain simulation. The top panel shows the atrial geometry from the posterior view where 
wavefronts collide and cause an acceleration of the wavefront in the bidomain, but not in the 
Eikonal model. The bottom panel shows the effect of convex wavefronts at the right atrial 
appendage as well as at the connection between the Bachmann’s bundle and the left atrium. 


myocardial fiber orientation need to be assumed when personalizing computer 
models. As a consequence, the high uncertainty of the ratio between o; and o, 
which cannot be measured patient-specifically with reasonable efforts further 
justifies the application of simplified models that do not involve uncertainties in 
non-measurable entities and only cause minor differences in LATs, ECGs and 
APDoo. 

Among all investigated simplified model solutions, the monodomain model 
yielded the most accurate results regarding activation times, repolarization behav- 
ior and ECGs. However, explicit conductivity tuning for the monodomain model 
neither had a notable effect on LATs, nor APDoo, nor on the 12-lead ECG. 

Mean and standard deviation of the absolute LAT differences to the bidomain 
results quantified to 1.37+1.16 ms and 1.43+1.17 ms for the RE* and the Eikonal 
model, respectively, which differed only slightly due to numerical jitter. The 


63 


Chapter 4. Comparison of Propagation Models and Forward Calculation Methods 


distribution of LAT discrepancies to the bidomain results mapped on the atrial 
geometry was similar for the Eikonal and the RE* model (see Figure 4.4). The 
LATs of the simplified propagation models were especially higher compared to 
the bidomain results in regions on the posterior left atrial wall. In these late 
activated areas, different wavefronts collide causing an acceleration of the wave 
in the bidomain model, which is not captured in the mathematical description of 
the (reaction-)Eikonal model. Source-sink mismatch effects caused by convex 
wavefronts entailing conduction slowing in the bidomain model cause smaller 
LATs in the Eikonal simulation results. This effect is especially visible in the 
area where Bachmann’s bundle connects to the anterior wall of the left atrium 
(compare Figure 4.11, right panel), i.e. where a small source (Bachmann’s bundle) 
meets a large sink (the left atrium). At the apex of the right atrial appendage, two 
convex wavefronts traversing the tissue from the lateral and the septal right atrial 
wall collide and cause Eikonal LATs to be smaller than the ones resulting from the 
bidomain simulation. The P waves computed with the reaction-Eikonal and the 
Eikonal source distribution showed similar correlation coefficients of 0.921 and 
0.920 to the bidomain results. However, when evaluating repolarization dynamics, 
RE? clearly led to more precise results than the Eikonal model. This reflects on 
the one hand in smaller APDog discrepancies to the bidomain simulation results. 
The small APDoo discrepancies between the monodomain and RE* simulation 
results might have occurred due to differences in the activation pattern or a 
mismatch between the diffusion term and the Ifoot current in the case of curved 
wavefronts or wave collisions causing different AP upstrokes and amplitudes 
which subsequently lead to subtle APDs changes. On the other hand, the RE+ 
model is capable of faithfully replicating both the P wave as well as the atrial 
repolarization phase in the ECG, whereas with the Eikonal model, only the P wave 
highly resembles the bidomain results. Using precomputed AP templates to obtain 
the transmembrane voltage source distribution for the Eikonal LAT results, APDog 
results were systematically smaller compared to the bidomain results in regular 
bulk tissue regions and systematically higher in fibrotic regions. The more precise 
representation of repolarization behavior in simulation results using the RE* 
model is due to local APD balancing caused by the diffusion term. Consequently, 
also the Ta wave in the ECG obtained with the source distribution derived from 
the Eikonal results only showed a correlation coefficient of 0.62 to the bidomain 
ECG. 
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ECGs calculated with the BEM highly resembled the ECGs obtained with the 
FEM. P wave correlation coefficients to the FEM approach quantified to 0.94 and 
0.93 for the simulation scenario with fibrosis modeled as slow conducting and non- 
conductive patches, respectively. In the former scenario, surface transmembrane 
voltages can be used as a source model for the forward calculation, whereas in the 
latter, volumetric sources such as primary impressed currents were necessary to 
model the effect of passive conduction barrier not contributing to the electrical 
source distribution in the heart. If the surface transmembrane voltages had been 
used as sources for the forward calculation in this case as well, an artefact in form 
of an offset in the isoelectric line in the P wave would have been induced. The IVC 
method instead yielded more inaccurate ECGs results. Especially in the septal 
and anterior leads, the ECG amplitudes were overestimated by a factor of >2 
compared to the FEM results. On the one side, this observation can be traced back 
to the method’s assumption that the atria are immersed in an infinite medium of a 
homogeneous conductivity, which does not allow considering a heterogeneous 
conductivity setup in the torso. On the other hand, the high ECG errors occurred 
predominantly in leads measured at electrode locations on the body surface in 
close proximity to the cardiac sources. Thus, neglecting the attenuating effect that 
the bounded torso volume conductor introduces, causes a more pronounced effect 
on the resulting ECGs in V1-V3. 

Simulations were run on a 16 core CPU machine (Intel Xeon Gold 6230, 
2.1 GHz). The full bidomain and the pseudo-bidomain simulation of a duration 
of 450 ms were completed in 25 and 1.5 hours, respectively. Computation time 
for the RE setup was 1.4 hours on a 6 core machine. The computation of the 
transfer matrix for the BEM approach in the case of a heterogeneous torso volume 
conductor with seven surfaces bounding the atria, the torso and other organs 
took 2 hours on a 4 core CPU machine (Intel Core i5, 2.4GHz). The speed-up 
in computation times when using simplified propagation models is comparable 
to a ventricular setup. Computational performance improved by one order of 
magnitude when using the monodomain model [99] and up to three orders of 
magnitude when using the Eikonal model [104, 125] compared to bidomain. How- 
ever, increasing the number of cores the simulations are ran on could change the 
results regarding algorithmic efficiency as different models might exhibit different 
scalability properties when parallelized to multiple threads or processes [154]. 
Solvers with strong scaling capabilities have been shown to provide the basis 
for fast simulation runs of the biophysically detailed monodomain model and of 
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forward calculation methods without any cutbacks on anatomical and electrophys- 
lological properties [155, 156]. In the simulations in this study, the degrees of 
freedom in terms of number of nodes and elements in the mesh was the same 
for all propagation models. High resolutions in time and space are required for 
numerical convergence of the bi- and monodomain solution [92]. As described 
by Woodworth et al. [157], a high mesh resolution is a necessary requirement 
for CV convergence, especially for low conductivities (see also Figure 4.12 and 
Figure 4.13). On the other side, (reaction)-Eikonal models are capable of faith- 
fully estimating activation time sequences on coarser meshes [104, 158]. The 
computational complexity of the Eikonal model depends on the method used 
to solve it [159], but is approximately O(nlog(n)) with n being the number of 
nodes in the mesh. These properties could be taken advantage of to further reduce 
computational cost when running simulations based on these simplified models. 
Computational savings using the BEM approach are on the one hand due to the 
decreased problem complexity when discretizing the domain with surface instead 
of volume elements [106]. On the other hand, coarser resolution meshes can 
be applied which is the key influencing factor for an improved computational 
efficiency over FEM [160]. 


4.4.2 Related work 


In this study, the differences in activation and repolarization times were examined 
when using different propagation models in atrial electrophysiology, which, to 
the best of our knowledge, has not been done before in a comprehensive way. 
However, comparable studies have partly already been conducted for the ven- 
tricles and four-chamber heart models. Potse et al. [127] found that activation 
using bidomain was 2 % faster compared to the monodomain approach for a 
complete cardiac cycle. Also in this study, the monodomain activation times 
were on average 1 ms higher than those obtained from the bidomain simulation. 
Considering the total time of 102 ms for a complete activation of both atria, the 
discrepancies between mono- and bidomain resulting in this study correspond to 
a relative error of approximately 1% ‚too. Pashaei et al. [161, 162] as well as 
Wallman et al. [125] found that the differences in activation times are small for a 
ventricular simulation setup when comparing biophysically detailed approaches 
and the Eikonal model. Neic et al. [104] compared extracellular potential fields, 
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electrograms and ECGs calculated with the RE and the bidomain model for the 
ventricles and concluded that the simplified model can replicate the gold standard 
results with high fidelity. The results in this work confirm the findings from previ- 
ous studies mainly conducted for ventricular simulation setups. Gassa et al. [163] 
investigated the suitability of an RE model to generate re-entrant activity on a 
bi-atrial geometry and succeeded in replicating the wave patterns resulting from 
a monodomain simulation. It was also shown in a preceding study of this work 
that the Eikonal-based models can produce activation times and ECGs resembling 
full bidomain simulation results with high fidelity in an atrial model without 
cellular remodeling placed in a homogeneous torso volume conductor [164]. The 
setup described in this chapter was extended to heterogeneous scenarios covering 
cellular and conductivity heterogeneity in both the torso and the atria and similar 
results were obtained. 

Previous studies have also investigated the application of simplified forward 
calculation methods to computed ECGs. Schuler er al. [152] suggest the calcula- 
tion of ECGs based on the BEM with coarse resolution surface meshes bounding 
the heart and the torso whereby parameters to blur the cardiac sources are opti- 
mized beforehand to avoid discontinuous wave propagation. In this way, they 
obtained body surface potentials in accurate accordance with the bidomain simu- 
lation results for a ventricular setup. However, one major drawback of the BEM 
approach is the impossibility of accounting for anisotropic conductivity in the 
myocardium [106]. However, the P wave correlation coefficients was found to 
quantify to >0.93 showing that the isotropic assumption yields similar ECGs 
compared to the bidomain results. For the IVC method instead, not only the 
assumption of isotropic myocardial conductivities but also of a homogeneous 
torso volume conductor has to be made. Moreover, the simplified assumption 
that the atria is immersed in a medium of infinite spatial extent does not hold 
true. Although the general P wave morphology was preserved, the ECG still 
substantially differs regarding peak-to-peak amplitudes in the precordial leads and 
atrial repolarization in the inferior leads as it reflects in the results of this study 
and was reported in previous work [105]. For the application field of computing 
intracardiac electrograms, the reader is referred to the review by Sanchez et 
al. [103]. 
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4.4.3 Limitations 


In this work, four different simulation scenarios were investigated comprising a 
healthy baseline case and three atrial models infiltrated with fibrosis, which was 
modeled either as slow conducting patches, non-conductive conduction barriers 
or ionic conductance rescaling. For the spatially distributed fibrotic areas (patchy 
and diffuse), none of the fibrosis remodeling scenarios had a marked effect on 
the ECG compared to the healthy baseline case. Ionic conductance rescaling, 
slow conducting fibrotic patches and percolation reflect in the ECG as a slight 
prolongation of the repolarization phase and an offset in the isoelectric line, a 
prolongation of the P wave and a decrease in peak-to-peak P wave amplitudes, 
respectively. Even though all these effects on the ECG are small, they would show 
up in a more pronounced way if different fibrosis remodeling approaches were 
combined [122]. However, it was intentionally decided to investigate the effect of 
different propagation models and forward calculation methods in each of these 
simulation scenarios separately to shed light on which fibrosis remodeling aspects 
can be accurately captured by the simplified model solutions. 

In the simulation setup described above, neither motion nor contraction of the 
atria were considered for the sake of reducing model complexity and computa- 
tional cost. Moss et al. showed that a fully coupled electro-mechanical model 
does not have any influence on simulation results regarding atrial activation and 
that resulting P waves exhibit negligible differences to the ones computed on a 
non-deforming model [165]. However, the atrial repolarization results of this study 
might be affected to a larger extent by the lack of a coupled model as previous 
studies reported a substantial impact of mechanical feedback on electrophysio- 
logical behavior in the ventricles [166, 167], especially during the repolarization 
phase [165, 168]. 

CVs were derived from the values reported in [140]. Based on them, conduc- 
tivities were computed using tuneCV [100] as described in section 4.2.1 using 
strand meshes. However, no bath loading effects, mesh and wavefront curvature 
were considered when tuning the CVs, which might lead to mismatching CVs 
and conductivities assigned to different regions in the more complex atrial ge- 
ometry. Adding a bath in the experiments set up for the tuning process could 
lead to more similar LAT and ECG results between the bidomain results with 
Clerc and Roberts conductivities on the bi-atrial geometry. Moreover, performing 
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Figure 4.12: Experiment for quantifying the error in LATs due to a coarser mesh resolution 
in the bidomain model. The top panel shows the 5 cmx2 cmx2.8 mm block mesh with a 
resolution of 528 um. The mesh resolution on the same geometry was refined to 265 um by 
linearly subdividing the tetrahedral elements (top right panel). The excitation was initiated by 
pacing from the left side of the block. When the planar wave passes through the isthmus, 
it propagates with a curved wavefront onwards. The bottom panel shows the signed LAT 
differences between the nodes in the coarse and the fine mesh. Two action potentials at 
nodes on the right and the left end of the block are visualized for the fine and the coarse 
resolution mesh. 


the tuning with a bath attached to the strand geometries would lead to different 
monodomain conductivities for the setups without explicit conductivity tuning 
while the conductivity values in case of explicit conductivity tuning for the mon- 
odomain simulation would remain unchanged. Conductivities in transverse and 
longitudinal direction needed to be scaled by a factor of 54 and 12, respectively. 
The tuning procedure caused the original transversal vs. longitudinal conductivity 
ratios reported by Clerc and Roberts et al. to change while keeping intra- vs. 
extracellular conductivity ratios constant. 

The average edge length of the atrial geometry was 523 um. To quantify the 
numerical error arising due to the mesh resolution, additional experiments were 
conducted on a 5 cmx2 cmx2.8 mm block mesh with a resolution of 528 um 
and a refined resolution of 265 um by linearly subdividing the elements (see 
Figure 4.12, top panel). 
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Figure 4.13: Experiment for quantifying the error in LATs due to a coarser mesh resolution in 
the bidomain model in case of fibrotically infiltrated tissue. The left panel shows the model 
setup with a fibrotic patch of 80% density and a planar wave propagating along the block. 
The right panel shows the color-coded differences in LATs for the coarse and the fine mesh. 


Using the same numerical settings as for the experiments on the bi-atrial 
geometry, a simulation of a planar wave passing through an isthmus and then 
propagating with a curved wavefront was run. Conductivities were adjusted 
using tuneCV [100] as described in section 4.2.1 to the CV in the regular atrial 
bulk tissue region. Maximum LAT differences between the experiments on the 
coarse and the fine mesh were 1.2 ms. Considering the total activation time in 
the block of 43 ms, the error introduced by the coarse mesh resolution was 2%. 
The root mean squared errors between two APs resulting from the simulations on 
the coarse and the fine mesh were 0.0186 mV and 0.0491 mV for the two nodes 
marked in Figure 4.12. When adding a fibrotic region to the block, the maximum 
absolute LAT error between the experiments on the fine and the coarse mesh was 
1.2 ms (~2 %) as well (see Figure 4.13) for a planar wave propagating along fiber 
direction. The latter is approximately also the case in the bi-atrial simulation 
setup where the depolarization wavefront traverses the elliptically shaped fibrotic 
patches along their main axis growing predominantly in fiber direction. However, 
if a notable transverse wave propagation had to be represented, the chosen mesh 
resolution of 523m would have been too coarse to capture the wave propagation 
at a velocity of 0.15m/s. Thus, the mesh resolution chosen for the atrial model in 
this study might introduce an error of 2 %, which is equivalent to an absolute LAT 
error of ~2 ms on the bi-atrial mesh. Due to the small root mean squared error 
between the APs of the coarse and the fine mesh, no additional discretization error 
affecting APDoo is expected. 
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45 Conclusion 


The results presented in this chapter show that the Eikonal model is capable of 
faithfully producing LATs and P waves compared to full bidomain simulations 
with a reduction of computation times by a factor of up to three orders of magni- 
tude. However, propagation models neglecting diffusion terms lack the fidelity 
in terms of repolarization as shown by APDoo deviations. Thus, RE models are 
needed, e.g., in cases where repolarization dynamics are of significant importance 
such as e.g. for re-entry mechanism studies. ECGs calculated with the BEM 
accurately resemble the FEM results for both P waves and the ECG in the repo- 
larization phase. When computing ECGs with the IVC method, the systematic 
overestimation of peak-to-peak P wave amplitudes especially in the precordial 
leads should be taken into account when evaluating P wave features. 
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PART III 


MULTI-SCALE GENERATION 
OF VIRTUAL COHORTS OF 
COMPUTATIONAL MODELS 


Chapter 5 


Development and Evaluation of 
a Bi-atrial Statistical Shape 
Model 


In this chapter, the construction and evaluation of a bi-atrial statistical shape model 
(SSM) as well as the generation of simulation-ready volumetric atrial geometries 
based thereon are described. 


The content of this chapter is taken from a paper that has been published open 
access under licence CC-BY 4.0 in Medical Image Analysis [169]. Most passages 
have been quoted verbatim from the publication. 


5.1 Introduction 


A wide range of machine learning approaches have already been proposed for 
classifying cardiovascular pathologies based on the 12-lead electrocardiogram 
(ECG) [24, 170-172]. The limitations in clinically recorded data [21, 24, 173- 
175] for training purposes of such a classifier as explained in section 1.1 call 
for simulated synthetic ECG as a source for large, representative and well con- 
trolled datasets. These datasets can be used to directly deduce diagnostic criteria 
visually [176] or to train machine learning classifiers to discriminate between 
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different cardiac diseases and healthy individuals [47, 177]. The advantage of 
using simulated over clinical data lies not only in the precisely known ground 
truth of the underlying pathology that was defined for the simulation, but also in 
the possibility to generate a virtually infinite amount of signals for each pathology 
class. 

Nevertheless, atrial, ventricular and thoracic geometrical models are needed 
for conducting electrophysiological simulations to obtain the 12-lead ECG. In this 
regard, SSMs allow to compile a wide range of realistic geometries that represent 
the variability observed in the cohort used to build the model. While SSMs of 
the human ventricles [178] and torsos [179] exist and are publicly available, an 
open shape model of both atria covering all relevant anatomical structures for 
electrophysiological simulations (atrial body, appendages, pulmonary veins (PVs)) 
is still lacking to date to the best of our knowledge. 
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Chapter 5. Development and Evaluation of a Bi-atrial Statistical Shape Model 


Different statistical atlases of the human whole heart anatomy [185-191] have 
been constructed for segmentation of magnetic resonance (MR) or computed 
tomography (CT) images by means of active shape modeling approaches and 
are summarized in Table 5.1. However, those models are usually built based on 
a small number of sample segmentations or were not made publicly available. 
Furthermore, different SSMSs of only the left atrium (LA) have been presented in 
various studies either for the purpose of simulations [180] or for characterizing 
changes in shape of the LA [66, 182-184] in patients with atrial fibrillation (AF). 
These LA models are built based on a solid number of instances, but lack the 
right atrium (RA) and often also the left atrial appendage (LAA). However, these 
anatomical structures are not only indispensable for the use case of ECG simula- 
tions. They are also of particular importance when investigating the mechanisms 
of typical atrial flutter in the RA or bi-atrial flutter and fibrillation. Additionally, 
the LAA is highly relevant for studies examining blood clot formation causing 
stroke [192], LAA occlusion as potential therapy [193] and the role of the LAA 
in the onset and maintenance of AF [194]. 

Due to the lack of ready-to-use models of the atria, a bi-atrial SSM would 
cater the need of generating geometrical atrial models representing inter-subject 
anatomical variability. These could be employed to gain a comprehensive under- 
standing of the underlying mechanisms of the onset and perpetuation of re-entrant 
activity during atrial flutter and fibrillation not only in personalized computer 
models [33, 195-197] but also in a large patient cohort [54, 198]. Thus, including 
the shape of the RA in the model as well as making the bi-atrial SSM available 
to the community, enables large-scale simulation of atrial signals. Although 
the focus of the work presented in this chapter is the application of the SSM 
for ECG simulations, its field of application is not limited to this particular use 
case. The bi-atrial model can also be exploited for other in silico approaches like 
continuum-mechanics and fluid simulations. Furthermore, active shape model- 
ing approaches [199] using the novel bi-atrial SSM could facilitate automated 
segmentation of the atria from CT or MR datasets. 

In this chapter, the development of an SSM comprising both atria based on 
manual segmentations of 47 CT and MR scans is described. Furthermore, a 
workflow to generate a volumetric atrial model based on an instance derived from 
the SSM is proposed. A major added value of this work for the community is 
the provision of the bi-atrial SSM under the creative commons license CC-BY 
4.0 together with 100 exemplary volumetric models derived from it including 
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fiber orientation, inter-atrial bridges, material tags and solutions to Laplace’s 
equation [200]. 


5.2 Methods 


The geometric representation as well as the variation in shape among a set of 
individual three dimensional objects can be described by SSMs (see section 3.2 
and Figure 3.2). Point distribution models [109] are the most common subclass 
of SSMs and require a vectorized point-based representation s, of any individual 


geometry I,, comprising M consistently sampled surface points [xn, Yp, Zn]: 


T 
Sn = [Xn,1 Yn, 1;Zn, 13+- ‚Xn,M »Yn,M»Zn,M] (5.1) 


Assuming that the spatial variations of the surface points follow a multivariate 
normal distribution, a compact representation of the sample mean and sample 
covariance matrix describing the shape variations can be obtained by applying 
a principal component analysis (PCA) to the observations s (see section 3.3 for 
further details). In this way, all N individual shapes T can be represented by a 
linear combination of N — 1 basis functions v: 


N-1 
Sy = $S + y nk Ok ` Vk , (5.2) 
k=1 


with s being the mean shape vector as well as o and v representing the 
eigenvalues and eigenvectors of the covariance matrix, respectively. r,,k represent 
the weighting coefficients for the individual eigenvectors. To obtain this parametric 
representation of the shape variations from clinical MR or CT data, a number 
of preprocessing steps have to be performed (see Figure 3.2 and Figure 5.1): 
i) segmenting the images, preferably in a (semi-)automatic manner, ii) rigidly 
aligning the resulting shapes in space, iii) establishing a dense correspondence 
between the individual shapes to obtain the shape vectors s© that were then subject 
to PCA. 
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Figure 5.1: Construction of a statistical shape model of the atria. 
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5.2.1 Acquisition of the Initial Model Population 


Three independent multi-center, multi-vendor databases [201-203] were used to 
build the SSM. Their properties are summarized in Table 5.2. The images originate 
either from healthy subjects or from patients suffering from AF. Since all of the 
challenges focused on the segmentation of the LA, 23 of the originally available 
images had to be excluded due to an incomplete capture of the inferior right atrial 
body or an inadequate signal-to-noise ratio. Since a PCA description was relied on 
for representing the atrial shape variability with which only continuous changes 
of vertex locations are representable, only subjects with 4 PVs were considered. 
The number of PVs attached to a subject’s atrium is discrete and the absence or 
the presence of an additional PV is not characterized by a continuous transition 
which is why only data samples of patients with 4 PVs were included in this 
study. According to Marom et al. [204], this PV configuration is representative 
for the majority of the population with 71 % and 86 % of all subjects in their study 
exhibiting two ostia on the right and left side of the LA body, respectively. 

In order to obtain the individual instances of both atria, a semi-automatic 
segmentation of the blood pool representing the endocardial surface of the left and 
the right atrium from MR and CT images was performed using CemrgApp [205]. 
2D region growing in several selected slices as well as 3D interpolation were 
applied to each image stack. To reduce the impact of noise or image artefacts on 
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Table 5.2: Summary of datasets used to generate the statistical shape model. 


Dataset | Source Number of | Voxel resolution 
subjects 
1 Left atrium segmentation | 30 MRI 1.25 x 1.25 x 2.7 mm 
challenge 
[203 
2 Left atrium fibrosis and scar | 8 MRI 1.25 x 1.25 x 2.5 mm 
segmentation challenge 
[201] 1.4 x 1.4 x 1.4 mm 
3 Left atrial wall thickness chal- | 9 CT 0.8 x 1 x 0.4 mm 
lenge 
[202 


the segmentation outcome, details were manually corrected. Figure 5.2 shows 
examples of incorrect segmentation results due to insufficient region growing 
performance (left column) and 3D interpolation (middle column) that made a 
manual correction indispensable. Automated segmentation with region growing 
especially failed in 2D planes where both the LA and the left ventricle are visible, 
as the mitral valve exhibits the same image intensity as the LA and the left 
ventricle. Therefore, a cutting plane between atrium and ventricle was inserted 
manually. The drawbacks of 3D interpolation were particularly affecting the areas 
around the PVs where the interpolated surface tends to close small gaps between 
the PVs and the atrial body. For 20 images in dataset 2, a segmentation of the 
LA was provided. However, the LAA was truncated in close proximity to the left 
atrial body [203] in these segmentations. Since the variability of the LAA shape 
was aimed to be incorporated in the model, the LA segmentations of dataset 2 
were adapted to include the full volume of the LAA as shown in Figure 5.2 (right 
column). The resulting segmentations were exported as triangular meshes. 


5.2.2 Rigid Alignment 


After segmenting the individual instances I’, of the atria, the resulting isosurfaces 
exported from CemrgApp were rigidly aligned in space to avoid a representation of 
translation and rotation parameters in the eigenmodes of the SSM [206]. This was 


81 


Chapter 5. Development and Evaluation of a Bi-atrial Statistical Shape Model 


Figure 5.2: Segmentation inaccuracy due to different automated segmentation methods. The 
different rows represent the axial, sagittal and the coronal plane, respectively. The images in 
the left, middle and right column show the segmentation errors due to region growing, 3D 
interpolation and a partly excluded LAA in the ground truth data colored in red, respectively. 
The green contours mark the manual corrections tailored to the correction of inaccurately 
found automated segmentations. 


performed automatically by means of the iterative closest point (ICP) algorithm 
that provides a solution to the orthogonal Procrustes problem [110]. Surface-to- 
surface distances were calculated with the vtkDistancePolyDataFilter (Kitware, 
Clifton Park, New York, USA) between each pair of individual instances Fn. The 
reference template was chosen as the surface holding the smallest mean distance 
to all other instances (compare Figure 5.3). 


82 


5.2. Methods 
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Figure 5.3: Surface-to-surface distance of each pair of individual instances. The green sur- 
face held the smallest mean distance to the remaining atria in the dataset and was chosen 
as a reference for the alignment and morphing process. The red boxed geometry outlines an 
example in three different views of an instance holding a notably larger surface-to-surface 
distance to the other data samples. 


In each iteration i, candidate correspondences [X,, InsZulr between a target 
mesh I, and the reference mesh [z were found by attributing the point with the 
smallest Euclidean distance in Ip to each node in T’,. Procrustes analysis was 
then used to estimate the linear transformation T; — consisting of rotation and 
translation — which yields the best match of the candidate correspondence points 
Rn; Ins enki with the reference points [Xr,yr,Zr]’ - After applying the estimated 
transformation T; to the points in mesh T,: 


Paii =T;- Ly; with Tno = (5.3) 


new candidate correspondences [X,, le: +1 are recursively found between 
the transformed mesh jee and I‘; at each iteration i and used for solving the 
Procrustes problem. If after several recursive calls of the function, either the 
maximum number of 150 iterations is exceeded or the convergence criterion is 
fulfilled, an optimal transformation matrix for the alignment of both shapes T, 
and Fp is found resulting in a set of N rigidly aligned shapes r^. 
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5.2.3 Dense Correspondence Retrieval 


Each aligned individual instance n comprises M,, surface points Pe ; y4 i wis 
en wu: In order to describe the variations in shape of the aligned 
instances I“ by means of a point distribution model, correspondence between 
the vertex IDs among all individual shapes have to be established. Establishing 
correspondence requires to retrieve concordant points in all shapes TÅ, so that the 
N aligned shapes are not only represented by the same amount of surface points M 
but also that each point Le ans Venn with a specific ID m represents the same 
morphological point in any arbitrary shape n. For this purpose, Gaussian process 
morphable models (GPMMs) [111] available as a built-in library in Scalismo- 
lab [207] were used to subject a reference shape r to a generic deformation in 


such a way that the deformed shape YA matches the individual aligned shape rå 
in the best possible way. This process then yields a set I“ of aligned shapes that 
are characterized by homologous, corresponding surface points. For this process, 
three independent generic deformations were defined by Gaussian process (GP) 
models. Gaussian kernels described the similarity between two distinct points x 
and x’: 


(5.4) 


12 
k(x,x’) =s-exp (==) 
The kernels were approximated by the leading eigenfunctions of their Karhunen- 
Loëve expansion as described in [111]. They were further employed to fit the 
orientation of the left and right pulmonary veins (LP V, RPV), the atrial body, as 
well as the LAA and right atrial appendage (RAA). The separation into three 
different models (atrial body, appendages, PVs) served two different purposes. 
On the one hand, the high anatomical variability of the appendages could be 
accounted for by allowing smaller inter-dependencies spanning between the points 
located on the appendages. On the other hand, the generic model varying the 
points located on the PVs was designed such that only the orientation of the PV 
ostia but not their length was affected. Built-in functions of Scalismo [207] were 
used to successively fit the three anatomical regions of each aligned observation 
rå with a custom PV model and a generic deformation for the appendages and 


the atrial body. The optimization problem of fitting the GP model l; to the 
individual aligned shapes T was solved using the Registration built-in class in 
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Figure 5.4: Morphable model for establishing correspondence at the PVs. The arrows indi- 
cate the movement of the respective vertices in the model for a variation of the four leading 
eigenvectors by —3o and +3o. The LA body and the RA shape are not affected by this PV 
model and are visualized in a semi-transparent manner. 


Scalismo with a limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) 
optimization [208, 209] minimizing the mean squared error between the vertex 


coordinates of the deformed model l. and the target shape r. To accurately fit 
the PVs, a kernel representing the orientation of the four PVs in anterior-posterior 
direction in its first four eigenvectors was constructed (Figure 5.4). This custom 
kernel was built by manually moving each of the four PVs of the reference shape 
r4 in two different directions at a time using Meshmixer (Autodesk, San Rafael, 
CA, USA). Disabling the mesh refinement option during this process ensured that 
the resulting eight shapes differed from one another only in their PV orientation 
and were in correspondence. Therefore, they could directly be applied to construct 
the custom PV model (Figure 5.4). The length of the PVs was intentionally not 
fitted since this quantity highly depends on the segmentation approach and does 
therefore not represent a proper observed anatomical variation considering the 
heterogeneous input data used in this study. A low rank approximation of a GP 
model was realized using the LowRankGaussianProcess class in Scalismo with an 
empirically chosen variance of s, = 50, l, = 40 at points representing the general 
atrial body (b) and s, = 20, l4 = 20 at the appendages to account for the higher 
anatomical variability in the appendage (a) regions. 
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5.2.4 Principal Component Analysis 


The vertices at the distal parts of the superior and inferior caval veins, the coronary 
sinus, the PVs, as well as the mitral valve and the tricuspid valve were discarded 
from the N aligned shape vectors in correspondence s€ to limit the model creation 
to atrial components relevant for electrophysiological simulations. Applying a 
PCA to these cut shape vectors in correspondence sC yields their mean shape 
sand N — 1 basis functions v along with their respective variances o7. In this 
way, an exact reconstruction of any individual shape instance IS is feasible by 
determining the coefficients r,, in equation 5.2 with an ordinary least squares 
estimation. Furthermore, additional arbitrary variation shapes Fpa in the span of 
the N — 1 basis vectors v can be derived by varying r. Under the assumption that 
the values of r are normally distributed among the observed instances r, keeping 
r in the interval [—3, +3]! yields realistic artificially generated shapes within 
the empirically observed variability. 


5.2.5 Generation of Arbitrary Volumetric Atrial Models 


The bi-atrial SSM represents the mean shape of the atrial endocardial surface and 
the variation of all point coordinates in space so that any arbitrary variation mesh 
Tar can be derived from it. However, a volumetric model of the atria, including 
inter-atrial bridges, anatomical labels and fiber orientations is required to perform 
electrophysiological simulations and to obtain realistic body surface P waves. A 
workflow was developed (see Figure 5.5) to fully automatically create a volumetric 
model based on an arbitrary endocardial surface. Since a segmentation of the 
epicardial surface from conventional MR images is usually not feasible due to 
an insufficient spatial resolution and a limited signal to noise ratio, the epicardial 
surface was augmented in a postprocessing step assuming a homogeneous atrial 
wall thickness. 

To approximate the epicardial surface, the endocardial surface of the variation 
mesh Ipar was dilated by 3 mm [210] at each point along the normal direction 
calculated as the mean of the adjacent triangle normals. Both surfaces were 
merged and intersections and holes between epi- and endocardium were corrected 
and closed automatically using the iso2mesh toolbox [211]. The closed surface 
was afterwards remeshed using Instant Meshes [131] and transformed into a volu- 
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Figure 5.5: Workflow for augmenting an endocardial surface derived from the statistical 
shape model with tags for anatomical structures, interatrial connections and myocardial fiber 
orientation. 


Definition of Fiber orientation 


metric tetrahedral mesh with an average edge length of 1 mm using Gmsh [212]. 
Contact and integrity between left and right atrium were preserved by duplicating 
the epicardial nodes belonging to the septum in regions where the left and the right 
epicardial surfaces spatially overlapped. The affected endocardial nodes were 
subsequently moved in negative direction of the surface normals to ensure a homo- 
geneous wallthickness of 3 mm. The algorithms described by Azzolin et al. [195], 
Piersanti er al. [135] and Wachter et al. [213] were used to automatically augment 
the models with Bachmann’s bundle, a coronary sinus and an upper and middle 
posterior inter-atrial connection between the LA and RA as well as myocardial 
fiber orientation and anatomical labels. The augmented anatomical structures 
are visualized in Figure 5.5. The only manual input required for augmenting the 
volumetric geometry with the aforementioned structures consists of 4 seed points 
defining the tips of the appendages and landmarks for the Bachmann’s bundle 
connection. They were defined once on the mean shape and were tracked through 
the deformations for any arbitrary variation mesh. In this way, it was ensured that 
the seed points were consistently chosen among all variation meshes. 
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Table 5.3: Conduction velocity (CV) along the transversal fiber direction in mm/s and 
anisotropy ratios (AR) in different atrial regions. 


Atrial Region CV, (mm/s) | AR 
Right atrium 739 2.11 
Left atrium 946 2.11 
Inter-atrial connections | 1093 3.36 
Valve rings 445 2.11 
Pectinate muscles 578 3.78 
Crista terminalis 607 3.0 
Inferior isthmus 722 1 


5.2.6 Parameterization of Electrophysiological 
Simulations 


100 random instances were derived from the bi-atrial SSM by drawing the eigen- 
vector coefficients r of Eq. 5.2 from a Gaussian distribution in the [-3, +3]o range. 
A fast iterative simulation was carried out for solving the Eikonal equation on 
these 100 geometries to obtain the spread of electrical activation and derive local 
electrical activation times for each node. Sinus rhythm activation was initiated 
from a sinus node exit site located at the junction of the superior caval vein and 
the RAA [149]. The atrial wall was separated into seven different anatomical re- 
gions: regular right and left atrium, inter-atrial connections, valve rings, pectinate 
muscles, crista terminalis and inferior isthmus. The conduction velocity (CV) 
along the fiber directions and the anisotropy ratio in the different regions were 
chosen as reported previously [149] and are given in in Table 5.3. 


5.3 Results 


The bi-atrial SSM is provided under the Creative Commons license CC-BY 
4.0 together with 100 exemplary volumetric models derived from it including 
fiber orientation, inter-atrial bridges and anatomical labels [200]. Furthermore, 
solutions to Laplace’s equation with various boundary conditions as proposed by 


88 


5.3. Results 


Piersanti et al. [135] are provided to facilitate the derivation of universal atrial 
coordinates [214]. The SSM is available as an h5 file encoding information 
about the mean shape’s spatial vertex locations and their triangulation. Also the 
eigenvectors and -values resulting from applying the PCA are included. The 
100 geometries were generated by sampling the eigenvector coefficients r from 
a Gaussian distribution in the [-3,+3]o range. These anatomical models are 
provided in VTK file format including fiber orientation as 3D vectors and material 
tags as scalar values in the cell data section. 

The shape statistics in the SSM resulting from applying PCA to the individual 
instances in correspondence are quantitatively shown in section 5.3.1. Further- 
more, three standard evaluation criteria for evaluating the SSM quality proposed 
by Davies et al. [64] were considered: generalization, specificity, and compact- 
ness. The generalization metric addressed in section 5.3.3 refers to the ability 
of the SSM to recreate an instance whose shape vector was excluded from the 
dataset used to create the SSM. The specificity metric (section 5.3.4) assesses the 
goodness of the model in terms of generating realistic shapes. Furthermore, the 
compactness (section 5.3.5) metric of the model increases the more the set of 
eigenvectors can be reduced while still being able to describe the majority of the 
total shape variance present in the dataset. 


5.3.1 Shape Statistics 


Applying a PCA to the cut shape vectors in correspondence sC as described in 
section 5.2.4 yields their mean shape s comprising 62,818 triangular cells and 
31,745 vertices with an average edge length of 0.929 mm. Furthermore, the 
eigenvectors and eigenvalues of the bi-atrial model were obtained. Figure 5.6 
shows the shape changes caused by varying the coefficients of the first eight 
eigenvectors. A qualitative description of the most prominent shape changes 
encoded in each of the eigenmodes - as outlined in grey in Figure 5.6 - is listed 
in Table 5.4. 


5.3.2 Correspondence 


The quality of the bi-atrial SSM was evaluated by assessing the mean vertex- 
to-nearest neighbor distances between the meshes in correspondence and their 
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Table 5.4: Summary of datasets used to generate the statistical shape model. 


Eigenmode Most prominent shape changes 


1 total volume for both atria 


2 asymmetric prominence of both appendages 
asymmetry of the left and right atrial volume 


relative position of the left and right atrium towards each other 


3 relative orientation of the left pulmonary veins towards each other 


relative orientation of the left and right atrium towards each other 


4 distance between the left and right atrium 


tilt angle of the appendages 


5 symmetric prominence of both appendages 


pointedness of the right appendage 


6 orientation of the right superior pulmonary vein 


volume of the left atrium 


7 volume of the left atrial appendage 


orientation of the right pulmonary veins 


respective original node locations. For the 47 meshes used to build the SSM, this 
mean distance across the dataset was 1.60 +0.25 mm. The mean vertex-to nearest 
neighbor distance per atrial region is visualized for all instances in Figure 5.7. 
The smallest morphing errors occurred in the appendages. In the bottom panel 
of Figure 5.7, two examples for a small and a large surface-to-nearest neighbor 
distance in the left atrium are visualized. 


5.3.3 Generalization 


To evaluate the generalization quality of the SSM, leave-one-out cross-validation 
was used and N datasets with N — 1 meshes each were defined by leaving out 
the final observation. Each of those was used to compute a reduced SSM. The 
excluded shape was reconstructed with the reduced SSM by determining the 
eigenvector coefficients using ordinary least squares. The similarity between the 
original excluded shape and the approximated one was assessed in terms of the 
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Figure 5.6: Eigenmodes of the bi-atrial shape model. The eight leading eigenmodes are dis- 
played. The different colors represent the applied eigenvector coefficients which were varied 
for each mode independently. 


Euclidean distance between the corresponding vertices. Figure 5.8 shows the 
distribution of this error metric for all instances in the dataset. The median of the 
vertex to vertex distance was below 1.56 mm among all shapes which compares 
to the order of magnitude of the MRI cross-plane resolution (0.4 mm - 2.7 mm). 
Instance 4, 11 and 43 hold the lowest Euclidean distances between the vertices 
of its original and reconstructed shape, whereas instance 47 is characterized by 
considerably high error values. Especially the 75th and the 95th percentile bounds 
comprise large vertex to vertex distances. Vertices showing larger deviations 
were located predominantly on the anterior wall and the appendages of both atria 
(compare Figure 5.8). 


5.3.4 Specificity 


The specificity of the bi-atrial model was evaluated by generating 1000 random 
shapes according to Eq. 5.2 by uniformly sampling r in the interval [—3,3]. The 
similarity between these random shapes and the respective closest shapes in the 


91 


Chapter 5. Development and Evaluation of a Bi-atrial Statistical Shape Model 


mean surface-to-nearest 
neighbor distance in mm 


Mesh in correspondence 


Original segmentation 


Figure 5.7: Correspondence results. Surface-to-nearest neighbor distance in mm evaluated 
per region between the surfaces in correspondence and the original segmentation. Two ex- 
amples for large and small correspondence retrieval errors (red and green boxes, respectively) 
are visualized in the bottom panel for the segmented left atrium (blue surface) and the mor- 
phed template (yellow surface). RA: right atrium, RAA: right atrial appendage, LA: left atrium, 
RPV: right pulmonary veins, LPV: left pulmonary veins. 


underlying dataset I used to build the SSM was assessed in terms of the root 
mean squared error (rmse) of all vertex-to-vertex distances between the randomly 
generated and the original shape. The rmse ranged from 4.65 to 10.83 mm among 
all 1000 random instances with a mean + standard deviation of 7.79 + 1.00 mm. 


5.3.5 Compactness 


Figure 5.9 depicts the total variance of the dataset explained by the model when 
including only a limited number of leading eigenvectors. 90 and 95% of the total 
shape variance in the individual segmented shapes can be covered by the SSM 
when considering only the first 18 and 24 eigenvectors, respectively. 
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Figure 5.8: Euclidean distance between vertices of the original and reconstructed shapes in 
mm. Two examples for small and large Euclidean distances (green and red box, respectively) 
are shown in the bottom panel. The blue surface represents the instance approximated with 
a reduced SSM built without this particular shape, the yellow surface the original correspon- 
dence results. 


5.3.6 Validation with Electrophysiological P wave 
Simulations 


Calculating the P wave on the 100 random geometries derived from the constructed 
SSM as described in section 5.2.6 yields the signals for the Einthoven, Goldberger 
and Wilson leads shown in Figure 5.10. The P wave duration was extracted for 
each of the 12-lead ECGs simulated on 100 random instances as the mean P wave 
durations across all 12 leads. The values of the P wave durations - arising from 
simulations in which only the atrial anatomy was varied - ranged from 85 ms to 
186 ms with a mean and standard deviation of 114.82 +20.19 ms. The probability 
density of the P wave durations extracted from the simulation results are visualized 
in Figure 5.11 in yellow. The distribution of P wave durations in the Copenhagen 
ECG study was reconstructed based on the values reported in Nielsen et al. [215] 
and are shown in blue in Figure 5.11. The interval of 90-111 ms characterizing 
P wave durations of subjects stratified with a low AF risk is marked in red. 
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Figure 5.9: Cumulative variance covered for different numbers of leading eigenvectors in- 
cluded. 


5.4 Discussion 


The main result of this study is a point distribution model incorporating the 
shape variations of the right and the left atrium as well as their appendages and 
the PVs. Moreover, a workflow was proposed for building a volumetric atrial 
model from an endocardial surface derived from the SSM. Together with 100 
example volumetric geometries generated by a Gaussian variation of the principal 
component coefficients in the [-3,+3]o range including fiber orientation, inter- 
atrial bridges and anatomical labels, the SSM is openly available [200]. 
Electrophysiological simulations covering atrial excitation spread and propa- 
gation of electrical potentials to the body surface were conducted on these 100 
example shapes. The resulting P wave durations obtained with the proposed SSM 
of 114.82 + 20.19 ms are in agreement with the P wave durations of 100-105 ms 
reported for individuals with a very low AF risk in an extensive cohort study based 
on 285,933 ECGs [215]. On the one hand, this suggests that the constructed model 
is capable of producing a large variety of variation shapes leading to realistic 
ECG feature values when compared to clinical recordings. On the other hand, it 
implies that the additional P wave duration variability observed in individuals with 
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Figure 5.10: P waves of the 12-lead ECG calculated on 100 geometries derived by modifying 
the eigenvector coefficients of the constructed bi-atrial shape model. 


increased AF risk (92-116 ms range covering 20-80% percentiles in Nielsen et 
al. [215]) is either due to pathological anatomical variability not represented in 
the healthy dataset used to build this SSM or due to non-anatomical, functional 
changes such as CV slowing due to fibrotic infiltration of the atrial tissue [216]. 
In the constructed model, 24 eigenvectors are necessary to explain 95% of the 
total variance of the dataset. In the LA-only SSM built by Corrado et al. [180], 
the first 15 eigenmodes cover 95% of the entire shape variance. Cates et al. [182] 
reported that only 8 eigenvectors account for 95% of the total variance. However, 
these two models do neither consider the LAA nor the RA which explains the 
higher complexity of this model and in turn the need to include more eigenvectors 
to cover the majority of the shape variance. By allowing only a variation of the 
PVs orientation in anterior-posterior direction during correspondence retrieval, 
changes in the PV lengths and diameters were prevented to be reflected in the 
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Figure 5.11: Probability density functions of P wave durations in the Copenhagen ECG study 
(blue) and in the simulation study (yellow). The area highlighted in red represents the value 
range of patients holding a low AF risk (90-111 ms) as reported by [215]. 


model’s eigenmodes, which was reported as a possible limitation of the model by 
Cates et al. [182]. 

Varying the first eigenvector in the LA SSM published by Varela et al. [184] 
causes a variation of the total LA volume as it is also the case in this model 
(Figure 5.6). Also Corrado et al. [180] and Cates et al. [182] reported a change of 
the total LA size in the first eigenmode. In the latter, the first mode represents a 
dilation of the LA mainly in anterior-posterior direction, which is also the case 
for the third eigenmode of the SSM built by Corrado et al. [180], where the first 
eigenmode rather represents an elongation of the LA in medial-lateral direction. 
Cates et al. [182] also constructed a separate SSM of the LAA and found that its 
first shape parameter corresponds to a change in the LAA length which is as well 
described by the third eigenmode of the constructed model in this work. 

The shape variations encoded in the leading eigenmodes of the novel bi-atrial 
SSM are consistent with previously published LA-only SSMs as far as the LA 
is concerned. This demonstrates that the model is able to reflect the same main 
shape variations even though it is based on a dataset of less than half the size 
compared to the other models. The second eigenmode of the model represents the 
asymmetry between right and left atrial volume. This further manifests the novel 
insights into inter-subject atrial geometry variations revealing with the constructed 
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model since this shape change cannot be captured with two separate RA and LA 
SSMs. 

The generalization results demonstrate that this model is able to accurately 
predict the shape of a previously unseen atrial geometry. The specificity results of 
7.79 + 1.00 mm leave room for improvement. However, the low specificity scores 
do not result from unanatomical characteristics of the generated shapes. They 
occur rather due to the small dataset of 47 instances available for selecting the 
closest shape during evaluation. Considering the MR slice thickness of predomi- 
nantly 2.7mm (Tab. 5.2), a specificity rmse of 7.79 mm is in the range of less than 
2 voxel diameters with segmentation uncertainty adding to it [202]. The speci- 
ficity evaluation of the model therefore indicates that randomly generated shapes 
produce valid shapes with an accuracy in the range of the error susceptibility 


during segmentation. 

The atria segmented for this study originate from datasets comprising im- 
ages of not only healthy subjects but also patients with a known history of 
AF. LA enlargement has been linked with an increased risk for this arrhyth- 
mia [176, 217, 218]. To ensure that the model is not based on a biased dataset 
with predominantly enlarged left atria, the LA volume (excluding LAA and PV 
ostia) of the N segmented geometries (82.16+19.16 ml) was compared to refer- 
ence values. Pritchett et al. [219] considered all age and BMI groups in healthy 
individuals. Translating their 2D measurements to 3D volumes as suggested by Al 
Mohaissen et al. [220] yields a [—3, +3]o range of 10-130 ml with the largest LA 
volume in this dataset (122 ml) being within that range. In this way, the dataset 
could have been kept as large as possible. For the same reason, no geometries 
segmented from different image modalities or different spatial resolutions were 
excluded, which might however impact the shape statistics of the SSM. The largest 
slice thickness (2.7 mm) is in the range recommended in Salerno et al. [221] and 
relied on in previous studies [180, 182]. This indicates that the coarsest spatial 
resolution in the training dataset was still sufficient to allow covering all relevant 
anatomical structures during segmentation. Regarding segmentation, the ground 
truth LA shapes provided publicly available along with the datasets and the auto- 
mated segmentation workflows in CemrgApp were relied on to the furthest extent 
possible. However, since manual corrections were indispensable in some cases 
(see Figure 5.2), inter- and intra-observer variabilities might still have had an 
impact on the final segmentation outcome. 
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Figure 5.12: Examples from the dataset used in this work assigned to the different LAA shape 
clusters as proposed by [222]. The ChickenWing type differs from the other types by a sharp 
bend, the WindSock type is characterized by secondary lobes only in inferior direction, the 
Cactus type in inferior and superior direction. The Cauliflower type doesn't exhibit any clear 
primary lobe. 


Due to the highly complex shape and the natural clustering of the LAA, the 
shape of this structure was categorized for all 47 subjects into four classes as 
proposed by Wang et al. [222]. 11%, 47%, 36% and 6 % of the segmented 
shapes were assigned to the ChickenWing, WindSock, Cauliflower and Cactus 
morphology classes, respectively. Examples from this dataset assigned to each 
of the four shape clusters are shown in Figure 5.12. The distribution of the LAA 
shape in the four categories are in accordance with the frequency of occurrence of 
the 612 LAA shapes studied by Wang et al. [222] in each class. This demonstrates 
that the dataset used in this work also exhibits similar variability concerning the 
LAA shape as observed in larger cohort studies. 
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PCA representation was chosen to describe the atrial shape variability and 
thereby assumed the changes in the atrial shape to be Gaussian distributed which 
does not hold true for the entire inter-patient shape variability [66, 222]. For the 
sake of comparability with previously published models and for reproducibility, it 
was intentionally decided to follow this state-of-the-art approach. Furthermore, 
the reference shape for the rigid alignment and the correspondence retrieval was 
chosen to be the geometry with the smallest mean vertex to nearest neighbor 
distance to the other shapes. The choice of the reference mesh for ICP only 
influences the alignment results regarding the final orientation of all aligned 
meshes. Since the mean vertex-to-nearest neighbor distances between the meshes 
in correspondence and the original instances quantified to 1.60 + 0.25 mm, it 
can be inferred that the fitting algorithm for correspondence retrieval performed 
sufficiently well. Therefore, the choice of the reference shape should not have any 
major influence on the correspondence estimate. In contrast, with the particular 
choice of the reference mesh, it was facilitated that the deformed atlas is still 


capable of representing finer structural changes present in other models and prevent 
that finer structures that would only be present in the atlas are over-represented in 
the final SSM. 

To showcase a potential application, multiscale electrophysiological simula- 
tions were conducted on 100 instances of the shape model. This proof of concept 
was deliberately based on a simple model not considering locally heterogeneous 
atrial wall thickness [202, 210, 223], disease-induced remodeling of membrane 
dynamics [224] or diffusive aspects during cardiac depolarization. Also only 
one torso shape and no rotation and translation of the atria within the torso are 
considered. The pipeline to generate volumetric models and simulation setups 
from the SSM is prepared for such extensions, though. Even though the shape 
variability covered by the model only comprises anatomical variants that were 
present in the dataset, i.e. healthy and AF patients without considerably enlarged 
left atrial volumes, the geometries can still be used for simulations of pathologies 
not affecting the atrial anatomy but reflecting in non-healthy functional anomalies, 
for example conduction blocks or ionic remodeling processes. Future studies 
focusing on repolarization could replace the simplistic Eikonal coupling employed 
here with a reaction-Eikonal scheme as suggested by Neic et al. [104]. 

The main advantage of the novel bi-atrial SSM consists in the automated 
generation of a large number of atrial geometries. In this way, the cumbersome 
and time-consuming process of anatomical model generation involving MRI 
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segmentation and defining bundles and fiber orientation can be facilitated and 
expedited. 

Potse et al. [225] examined the influence of electrical and structural remodel- 
ing on the maintenance of complex reentrant activitiy. With the proposed bi-atrial 
SSM, also the influence of the general atrial anatomy on the perpetuation and 
initialization of AF can be quantified. 

Saha et al. [226] investigated the effect of endo-epicardial activation delay on 
the P wave morphology. However, only one atrial geometry was used to deduce 
models of different atrial wall thicknesses in this study and the authors state 
the lack of using different geometries as a limitation of their work. The same 
limitation is listed in the study of Pezzuto et al. [227] aiming at quantifying the 
beat-to-beat variability of P waves in patients with AF. With the SSM, a larger 
number of different volumetric atrial models is easily acquirable. 

By means of the bi-atrial SSM, scale-large cohort studies using computer 
models for simulating atrial activity become feasible. Luongo et al. [228, 229] 
found a significant influence of the number of atrial anatomies included on the 
classification of different types of atrial flutter with a machine learning approach. 
With the proposed SSM, a large number of geometries can be deduced and 
used as a basis for in silico big data approaches such as to produce extensive 
datasets for machine learning applications. The provided instances are ready to 
be used off the shelf in available computational simulation environments such as 
openCARP for electrophysiology [98, 230], openFOAM for fluid dynamics [231], 
FEniCS for continuum-mechanics [232] or life* for coupled electro-mechanical 
simulations [136, 233]. 
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Chapter 6 


Generation and Calibration of q 
Multiscale Population of Atrial 
Models 


In this chapter, the generation and subsequent calibration of atrial computational 
models comprising anatomical variability of the atria as well as electrophysiologi- 
cal variability on cellular and tissue scale are explained. 


This project was carried out during a research stay at University of Oxford, 
Oxford, UK, in collaboration with Albert Dasi, Alfonso Bueno-Orovio and Blanca 
Rodriguez. Albert Dasi designed and conducted the study for generating and 
calibrating the population of cell models and provided all data used to enrich 
the atrial computational models with cellular variability. Blanca Rodriguez and 
Alfonso Bueno-Orovio provided guidance and supervision on all aspects of the 
project. 
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6.1 Introduction 


Large-scale datasets of cardiac signals obtained through multiscale electrophysio- 
logical simulations carry the potential to provide systematic insights into vulnera- 
bility, maintenance and termination of cardiac arrhythmias in entire subpopulations 
of patients. As such, they have been applied in various previous studies as an 
underlying database for e.g. disease classification and prediction with machine 
learning algorithms [47, 229] or for developing new pharmacological therapy 
options in in silico clinical trials [54, 198]. In these fields of applications, simu- 
lated datasets are particularly preferable over clinical data as they can represent a 
wide range of inter-patient variability and are not affected by ground truth label 
and signal quality distortions arising due to expert annotation uncertainties or 
interfering noise sources. 

For single-cell simulations, populations of candidate cellular models are typi- 
cally generated by varying cell model parameters within wide ranges [54, 234]. 
Subsequently, a calibration of this initial model cohort is indispensable to not only 
ensure that the virtual cohort is composed of an extensive amount of instances, 
but also that the simulated signals comply with real-world inter-subject variability 
from experimental findings [235]. In this regard, biomarkers are extracted from 
both the simulated and experimental action potentials and calcium transients and 
are used to prune model instances not in line with clinical observations. 

For 3D geometries representing atrial anatomy, statistical shape models can 
form the basis for sampling various anatomical endocardial instances representing 
the variability in shape encoded in the atlas [169]. Augmenting these models 
with a homogeneous wall thickness, tags for anatomical structures, rule-based 
myocardial fiber orientation and interatrial connections allows for generating 3D 
atrial models ready to use for electrophysiological simulations [52]. 

A combination of electrophysiological variability in form of populations of 
cell models and anatomical variability presents a promising approach to compose 
a large and diverse cohort of atrial models. As reported by Rodero et al. [185], 
only considering anatomical variability for cohort simulations is insufficient for 
representing the entire signal variability occurring in clinical practice. Thus, it is 
advisable to additionally include functional variability, for example in the form of 
conduction velocity (CV) and anisotropy changes, for compiling an initial model 
cohort (see section 6.2.1). In a step-wise approach, clinical reference ranges for 
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anatomical properties as well as local activation times (LATs) and action potential 
and electrocardiogram (ECG) biomarkers are employed to exclude parameter 
combinations yielding simulation results that do not comply with the signal 
variability observed in clinical recordings [236] (see section 6.2.2). 


6.2 Methods 


6.2.1 Generation of the Initial Candidate Model 
Population 


Figure 6.1 shows an overview of different types of electrophysiological and 
anatomical variability that were considered to compile the initial candidate model 
cohort. 

Cellular electrophysiological variability was realized by scaling maximum 
ionic conductances of 12 key parameters in the Courtemanche et al. [91] cell 
model by up to +50 % of their baseline value [237]. These parameters include 
the ultrarapid, rapid and slow delayed-rectifier K* current density (Gkur, Gkr 
and Gķęs), transient outward K T current density (Gio), Gki, GcaL, Gna, Na! /K + 
pump (Gnak), Ca2+/NaT exchanger (Gncx) and the sarcoplasmic reticulum Cat 
release (G,e1), leak (Gieak) and uptake (Gup) currents. 

To include atrial anatomical variability in the population, 4,000 atrial endocar- 
dial surfaces were derived by varying the leading 24 eigenmodes of the bi-atrial 
statistical shape model (SSM) (see chapter 5) with a latin hypercube sampling 
scheme. By including 24 eigenmodes, 95 % of the total atrial shape variation in 
the SSM were accounted for (see section 5.3.5). An interposed calibration step 
ensuring realistic left and right atrial volumes (see section 6.2.2) preceded the 
selection of 100 distinct atrial shapes that were continued to use for generating 
the volumetric instances constituting the anatomical basis for the initial candidate 
model population. 

On tissue scale, 15 spatially distinct atrial regions could be robustly annotated 
on the geometrical models using the pipeline developed by Azzolin et al. [52] 
to which heterogeneous conduction properties in terms of CVs and anisotropy 
ratios can be assigned (see Figure 6.3). A sensitivity analysis was carried out to 
identify key CV parameters affecting the body surface P wave [238] and in turn, 
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reduce the number of CV and anisotropy parameters to be varied for generating 
the cohort. In table 6.1, the ranges, in which CV and anisotropy were varied for 
the sensitivity analysis, are summarized. As an input for sensitivity analysis, a 
set of 17,280 combinations of different regional CVs and anisotropy ratios was 
compiled using latin hypercube sampling. The Eikonal equation was solved using 
the Fast Marching method and the sinus rhythm ECG was subsequently obtained 
as described in section 3.1.2.3. In contrast to the simulations covering cellular 
electrophysiological variability for the generation of the candidate population, 
only the baseline Courtemanche er al. [91] action potential was used as a template 
in all atrial regions for the sensitivity analysis. The forward problem was solved 
using the boundary element method (BEM) (see section 3.1.3.2). Additionally, 
a baseline 12-lead ECG P wave was calculated by applying the reference set of 
CVs and anisotropy ratios as summarized in table 6.1. For each of the 17,280 
P waves resulting from the latin hypercube CV and anisotropy sampling, the L2 
norm of the difference to the baseline ECG was calculated. This value was used 
as an output parameter for the sensitivity analysis to examine which regional CV 
and anisotropy parameters can cause a marked change in the ECG compared to 
baseline model parameters. Sensitivity was quantified by calculating correlation 
coefficients between the input model parameters (CV and anisotropy ratios in 15 
atrial regions) and the cost function (L2 norm distance to the baseline P wave). 

CV and anisotropy parameters with a sensitivity coefficient above 0.1 were 
selected to generate a set of 2,000 CV and anisotropy combinations using latin 
hypercube sampling to be combined with cellular and anatomical variability 
as will be explained in section 6.2.2. Functional parameters in regions with 
lower sensitivity coefficients were kept constant at their reference value listed in 
Table 6.1. 

Each of the 100 volumetric atrial instances were randomly joined with 20 
combinations of cell models and CV settings. Thus, the initial candidate popu- 
lation counted 2,000 multiscale model instances that were subject to additional 
calibration steps based on LATs as well as ECG biomarkers and morphology. 
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Figure 6.1: Electrophysiological and anatomical variability characterizing the initial candi- 
date model cohort. 
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Table 6.2: Scaling factors for Gio, GcaL and Gk, in 6 different atrial regions to account for 
spatially heterogeneous ionic properties in the 3D simulations. 


Atrial Region Gto Gcar | Gkr 
Right atrium 1 1 0.625 
Left atrium 1 1 1 
Crista terminalis 1.35 | 1.6 0.9 
Pectinate muscles 1.05 | 0.95 | 0.9 
Left atrial appendage | 0.65 | 1.05 | 2.75 
valve rings 1.05 | 0.65 | 3 


6.2.2 Calibration of the Initial Model Population 


Instances of the cell model were calibrated against experimental data obtained 
from patients under sinus rhythm and atrial fibrillation (AF) [198]. The retained 
ionic models were employed as baseline models for generating the multiscale 
population of atrial computational models. When coupling an ionic model with 
an anatomical geometry, selected ionic conductances were additionally scaled 
in different atrial areas to account for region-wise cellular heterogeneities (see 
Table 6.2). 

References ranges for left and right atrial volumes were taken from literature 
reports based on large cohort studies analyzing extensive patient datasets [17, 248]. 
Likewise, threshold values for atrial volumes associated with an increased risk 
for AF were extracted from previous studies. The generated 4,000 atrial shapes 
were compared to the reference volume ranges and only instances with left and 
right atrial volumes inside these ranges were retained for the selection of 100 
instances with maximum Dice coefficients to each other. The Dice coefficient 
served as a similarity metric to chose the most distinct instances for which 3D 
atrial volumetric geometries were created as described by Azzolin et al. [195]. 

To compile an initial population of 2,000 atrial models covering the different 
types of above mentioned variability, each atrial geometrical model was randomly 
combined with 20 calibrated cellular models and CV settings. 

Monodomain simulations were carried out to calculate the spread of the elec- 
trical depolarization wavefront and to derive LATs from them. The approach 
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described in section 4.1 was relied on to convert the chosen CVs into conductivi- 
ties defined based on the intra- vs. extracellular conductivity ratios reported by 
Clerc [142]. Thereby it is important to note, that the monodomain conductivi- 
ties were calculated based on the baseline Courtemanche et al. [91] cell model 
without taking ionic variability resulting from including cellular heterogeneity in 
the population into account. Thus, the effective CV in each tissue region could 
deviate by up to around +70 mm/s from the chosen target CV on tissue level 
mainly driven by marked in- or decreased sodium conductances. Subsequently, 
the forward problem was solved with the infinite volume conductor method (see 
section 3.1.3.3), using the electrode positions defined on the mean shape of the 
human body SSM developed by Pishchulin er al. [179]. Calibration on tissue 
level was performed by comparing the simulated activation times at 22 anatomical 
landmarks (see Figure 6.2) in both atria to clinical reference ranges reported by 
Lemery et al. [249]. Additionally, the maximum activation times in both the left 
and right atrium were added for the calibration on tissue scale based on LATs. 
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Figure 6.2: Landmarks for calibration of the candidate multiscale model population based on 
LATs. 22 landmarks were selected on the left and right atrium. For these, clinical reference 
intervals of activation times in normal sinus rhythm were reported. They were employed to 
exclude model combinations with LAT values outside these ranges. 


Finally, the forward calculated ECGs were employed for a final calibration 
step to exclude model instances out of range regarding P wave duration and ECG 
morphology. A principal component analysis (PCA)-based approach was used to 
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identify multiscale model instances leading to unrealistic ECG morphology which 
is explained in greater detail in the following. 

The PTB-XL dataset [16] served as the reference for clinical ECG signals. 
For each of the 9,528 healthy subjects in this database, a representative P wave 
template was calculated by averaging all detectable P waves over the entire 
recording time as described by Pilia et al. [250]. These P wave templates of all 12 
leads were then concatenated to one signal vector (see Figure 6.8) as implemented 
by Welle et al. [251]. These signal vectors were subsequently subject to a PCA 
yielding a mean vector [erin and eigenvalues and -vectors A,jjn and Verin: These 
quantities span an eigenspace representing the P wave morphology occurring in 
the clinical signals (compare Figure 6.8). The P waves of all retained simulated 
signals were concatenated likewise and approximated with the first 48 eigenvectors 
Vein accounting for 95% of the total variance in the clinical dataset using least 
squares estimation to obtain Cm: 


48 
ECGsim = Hatin oT y Ci,sim ` Aiclin ` Vi,clin (6.1) 
i=1 
Afterwards, the reconstruction errors € of the approximated and the actual 
simulated signals were evaluated by the root mean squared error (rmse) between 
them as 


E= J3 F (ECG) — ECG(t;))? (6.2) 
i=1 


with z; being the discretized time step values, N the total number of sampled values 
in the concatenated ECG vectors, ECG the approximated and ECG the actually 
simulated signal vector (compare Hotelling filter [252]). Simulation runs leading 
to an error € > 0.1 were not accepted in the final population (see Figure 6.11 for 
an example). 
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63 Resulls 


6.3.1 Sensitivity Analysis for Tissue-scale 
Electrophysiological Model Parameters 


In Figure 6.3 (b), the absolute sensitivity coefficients for regionally heterogeneous 
CV and anisotropy ratios are visualized. It is apparent that CV variation in 
left atrial bulk tissue, right atrial bulk tissue (lateral and septal wall) and crista 
terminalis had the most pronounced effect on the ECG. Anisotropy ratios played 
a minor role and did not cause as much of a change in the ECG as CV. CV 
in the four above mentioned regions were considered to compile a set of 2,000 
CV variations by latin hypercube sampling their values in the ranges reported in 
Table 6.1. CV in the remaining 11 and anisotropy values in all 15 regions were 
kept constant at their baseline values. 


6.3.2 Cohort Calibration 
6.3.2.1 Atrial Volumes 


Applying the clinical reference indices of [26 ml, 112 ml] and [34 ml, 148 ml] [248] 
for the left and the right atrial volumes, respectively, as lower and upper bounds 
to calibrate the anatomical model population leaves 3,925 out of the initially 
generated 4,000 endocardial surfaces fulfilling the clinical volume requirements 
(black enclosed areas in Figure 6.4). Dividing the atrial volumes bounded by the 
endocardial surface by the body surface area of the mean torso that was used 
for the ECG calculation in this study (1.9 m7), allows for their comparison with 
clinical thresholds for identifying patients with enlarged atrial volumes associ- 
ated with an increased AF risk. Out of the retained 3,925 calibrated endocardial 
shapes, 739 models were characterized by both left and right atrial volumes above 
the threshold of 34 ml/m2 and 42 ml/m? [88], respectively, and could thus be 
considered predisposed to develop AF based on their anatomical properties. 
Calculating the Dice coefficient as a similarity metric between each pair of 
the retained 3,925 anatomical instances leads to the selection of 100 endocardial 
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(b) Sensitivity coefficients for conduction velocity and anisotropy variation in 15 different 
anatomical regions. 


Figure 6.3: Absolute correlation coefficients between CVs and anisotropy ratios in 15 differ- 
ent atrial regions and the L2-norm to the baseline ECG. 


surfaces to be further processed through the pipeline developed by Azzolin et 
al. [195] to obtain simulation-ready volumetric atrial geometries (see Figure 6.3). 
Figure 6.5 depicts 5 examples of fully processed atrial anatomical models from 
the dataset of 100 volumetric instances. Each of these were subsequently paired 
randomly with 20 cell models and 20 sets of regionally heterogeneous CVs and 
employed to conduct monodomain simulations (see section 3.1.2.2) to obtain 
LATs and transmembrane voltage distributions. The latter were further projected 
onto the body surface using the infinite volume conductor method 3.1.3.3 to obtain 
the 12-lead (pseudo-)ECG. 
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Figure 6.4: Left (top panel) and right atrial volumes (bottom panel) for an anatomical model 
cohort comprising 4,000 endocardial surface instances derived from the SSM. The red curve 
represents a fitted normal distribution on the data samples. The black intervals indicate value 
ranges of atrial volumes occurring in clinical practice. The volume intervals shaded in blue 
visualize the ranges that are reported to be associated with an increased risk for AF. 


6.3.2.2 Local Activation Times 


Evaluating the LAT results from the 2,000 simulations (100 geometries x 20 cell 
models and 20 CV settings) with respect to clinically reported ranges for LATs in 
normal sinus rhythm [249], resulted in 1,652 model combinations with LAT results 
inside the clinical reference intervals. In Figure 6.6, the distribution of LATs at 
each of the landmarks are shown for all 20 CV and cell model combinations paired 
with one specific geometry (atrialD_062). For this atrial anatomical model, one 
of the 20 simulation runs yielded an LAT at the left superior pulmonary vein and 
a latest left atrial activation time above the clinically measured upper bound. This 
multiscale model was subsequently excluded for the calibrated population, the 
remaining 19 instances were accepted. 
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Figure 6.5: 5 exemplary simulation-ready volumetric instances of the atria. In each row, 
one atrial anatomical model is shown from the anterior, roof and posterior point of view in 
the right, middle and left column, respectively. The colors represent the regions that can be 
allocated heterogeneous conduction properties as explained in Figure 6.3. 


Figure 6.7 shows the percentage of the entire 2,000 simulation results that are 
distributed inside the clinical u #0, u #20 and u +30 (from top to bottom) 
ranges mapped on 22 regions around each of the considered landmarks (compare 
Figure 6.2) on the mean shape. It is apparent that the highest percentage of model 
combinations had to be pruned due to LAT simulation results outside the clinically 
reported intervals at the RA mid posterior wall, the LA mid anterior wall, the left 
pulmonary veins, the mitral valve (septal wall) and the sinoatrial node. However, 
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the percentage of model instances with LATs outside the u #30 range is small 
and quantified to only 17.4 % (348 out of 2,000). 


6.3.2.3 P wave Features and Morphology 


Mean and standard deviation of the P wave duration for the multiscale model 


instances passing the LAT calibration step was 126.55+27.78 ms. Applying 
the interval of [78, 144] ms as lower and upper bound for the P wave duration 
calibration step [215, 253], leads to 1,435 remaining model instances provisionally 
accepted into the population. 

Lastly, P wave morphology was analyzed in a final calibration iteration. In 
Figure 6.8, the mean P wave in all 12-leads in the clinical dataset is shown in red. 
The [-30, +30] range of the first, second and third eigenmode are visualized in 
grey. They account for 15%, 14% and 12% of the total signal variance in the 
clinical dataset. The first eigenmode mostly represents a change in the P wave 
amplitude in lead II, aVR and V5 as well as a change in polarity in lead III, aVL 
and aVF. The second and third eigenmode predominantly affect the amplitude and 
morphology of lead V5 and V6, respectively. 

The reconstruction errors € (equation 6.2) are depicted for each lead separately 
in Figure 6.9. Marked reconstruction errors only occurred for the precordial 
leads V1, V2 and V3 (Figure 6.9). A simplified forward calculation method was 
chosen to generate the large scale dataset of 2,000 simulated ECGs in a reasonable 
amount of time. As this approach is known to overestimate amplitudes in the 
ECG leads measured in close proximity to the sources (see chapter 4, in particular 
Figure 4.10c), a threshold € for the rmse averaged over all leads except for V1-V3 
of 0.1 was chosen: 

E= Yeli), i € [L,I M,aVR,aVL,aVF, V3 — V6] (6.3) 

In this way, 764 of the up until this point remaining 1,435 multiscale model 
combinations were accepted in the population. 

In Figure 6.10, 20 12-lead ECGs calculated for the 20 different CV and ionic 
model setups in one of the 100 atrial geometries are visualized. The signal traces 
in black passed all calibration steps and were accepted in the final population 
of multiscale models. The blue, cyan and red curves represent instances that 
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Table 6.3: Number of candidate models and accepted instances for each calibration step. 
Anatomical and ionic models were generated and calibrated independently from each other. 
Calibration based on local activation times, P wave duration and P wave morphology was 
performed consecutively leaving 764 accepted instances out of 2,000 initially generated can- 
didate models. 


Calibration step Candidate population | Accepted | Rejected 
Action potential biomarkers | 275 200 75 

Atrial volumes 4,000 3,925 75 

Local activation times 2,000 1,652 348 

P wave duration 1,652 1,435 217 

P wave morphology 1,435 764 671 


were rejected due to nonfulfilling LAT, P wave duration and P wave morphology 
criteria, respectively. 

Table 6.3 summarizes the number of model instances generated for the initial 
candidate population and the amount that has passed each individual calibration 
step. 


6.4 Discussion 


6.4.1 Main findings 


In the work presented in this chapter, a cohort of atrial models comprising anatom- 
ical and functional variability on both tissue and cellular scale was generated and 
calibrated. 100 atrial geometries with calibrated left and right atrial volumes were 
each paired with 20 CV settings and 20 cellular models to generate a candidate 
model population consisting of 2,000 instances. Calibrating this candidate cohort 
based on LATs and ECG biomarkers leaves a total of 764 multiscale models to be 
used in future for electrophysiological simulations focusing e.g., on in silico drug 
trials on 3D atrial models. The variability encoded in this population account for 
three key parameters that are known to contribute to the initiation, maintenance 
and termination of AF, as the condition for a re-entry to be sustained reads as 
follows [254]: 
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PL > CV -ERP (6.4) 


with PL being the path length, CV the conduction velocity and ERP the 
effective refractory periord. In the calibrated population of models, these quantities 
are modulated by a change of the atrial anatomy, CV variation itself and different 
ionic conductances driving the repolarization behavior of the cell, respectively. 

Sensitivity analysis of CV and anisotropy ratio in 15 spatially distinct regions 
revealed that three out of the four regions where a change in CV had a pronounced 
effect on the ECGs were located in the right atrium (RA). This is in line with the 
findings from Loewe et al. [140] reporting that around 70 % of the total P wave 
integral is attributable to electrical activation of the RA. Furthermore, the left 
atrium (LA) was subdivided into spatially coarser regions, i.e., the majority of 
the tissue volume was allocated only to the LA bulk tissue class, whereas the RA 
was additionally also composed of a septal region and crista terminalis which 
account for large fractions of the total RA myocardial tissue volume themselves. 
Sensitivity analysis also highlighted that anisotropy plays a minor role compared 
to longitudinal CV in the genesis of the ECG. This could potentially be traced 
back to the fact that longitudinal CVs were defined and divided by the prescribed 
anisotropy ratios to obtain transversal CVs. Furthermore, conduction in the atria is 
in any case already highly anisotropic, so that longitudinal CVs clearly dominate 
the preferred direction of wave propagation and transversal CV mediated by 
varying anisotropy ratios does not have a strong influence on the depolarization 
spread. 

Out of the initially generated 4,000 instances of the endocardial wall, the 
largest left atrial volume was 125 ml and was in the same ballpark as the largest 
volume of the individual segmented MR instances based on which the atrial 
SSM was built (compare chapter 5). However, when indexing the atrial volumes 
to the body surface area (BSA) calculated analytically for the torso geometry 
used to compute the ECGs in this study, a total of 739 models had LA volumes 
above the threshold for diagnosing left atrial dilation associated with an increased 
AF risk. However, using a different torso geometry and thus another resulting 
BSA or accounting for under- or overestimation when approximating BSA with 
only height, weight and gender as it is clinically routinely done, the number 
of anatomical models characterized by an increased AF risk based on volume 
properties could deviate markedly. 
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A total of 348 out of the 2,000 multiscale model combinations were excluded 
due to non-fulfilled LAT calibration requirements, whereas additional 671 were 
pruned due to non-realistic ECG morphology. After the LAT calibration, 217 
additional models had P wave durations outside the clinical reference ranges. 
The highest percentage of model combinations not accepted in the population 
during this calibration step exhibited LATs in the RA mid posterior wall. The 
exact position of this landmark is not specified in more detail by Lemery et 
al. [249] and thus, a slightly different placement of this point in the in silico 
model cohort could have led to different results. The relatively low number of 
models leading to non-realistic P wave durations could be traced back to the 
fact that latest right and left atrial activation times were already considered in 
the course of the LAT-based calibration step. In this way, models potentially 
leading to too long simulated P wave durations were already excluded by the time 
ECG biomarkers were computed for the ongoing calibration process. Usually, 
other ECG biomarkers for which clinical reference ranges are reported comprise 
peak-to-peak amplitudes and P wave terminal force in V1. However, both of 
these features are amplitude based and heavily depend on the signal acquisition 
procedure, electrode contact and the torso shape. As too many confounding factors 
affect these biomarkers, they were not employed in any of the calibration steps in 
this study. 

A novel approach for calibrating computational models particularly based on 
the ECG morphology was presented. In this calibration step, the largest number of 
model instances were rejected for the population (617 rejected instances compared 
to 348 and 217 rejected instances after LAT and P wave duration calibration). This 
highlights the importance of considering the entire P wave morphology in addition 
to only selected ECG biomarkers, such as P wave duration. Even though model 
combinations could lead to P wave durations inside clinical reference ranges, they 
do not necessarily exhibit typical ECG signal time courses provoked potentially 
through unrealistic CV combinations in different atrial regions. This might indicate 
that even if CV values are globally selected in reasonable intervals, certain local 
and region-wise CV distributions could still lead to unrealistic ECG phenotypes 
which are identifiable with the novel P wave morphology calibration procedure. 
To understand why and which certain model instances were not accepted into the 
final population, a clustering approach could reveal whether specific anatomical or 
functional parameters yield unrealistic ECG signals. Interpreting the eigenmodes 
resulting from applying PCA to clinical ECGs disclose the most prominent signal 
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variability occurring in clinically measured ECGs. The first eigenmode affected 
the polarity of lead III, aVL and aVF while the morphology in the remaining limb 
and augmented leads remained unchanged, indicating that the first eigenmode 
could represent a change of the electrical heart axis among different patients. 
The amplitude change in V2 in the third eigenmode could besides physiological 
reasons also be caused by inconsistent electrode placement for different patients. 
The striking amplitude variation in V5 for eigenmode 1 and 2 as well as in V6 
for eigenmode 3 was surprising and could have been arisen due to incorrectly 
delineated P waves in these leads. 

The evaluation of the coefficients c; sim (see equation 6.1) opens up new 
possibilities for future work. By comparing them to the distribution of the scores 
resulting from PCA on clinical ECGs, one could assess if the simulated ECGs 
are representative for a wide range of clinical signals or if a certain type of ECG 
morphology is not covered by the variability in the virtual cohort so far. 

The monodomain propagation model (see section 3.1.2.2) was used to calcu- 
late the LATs and the transmembrane voltage distribution for the 2,000 model 
instances of the candidate population. The Eikonal model has been shown to 
reproduce LAT results with high fidelity compared to biophysically more detailed 
models (see chapter 4) in normal sinus rhythm simulations. For the 17,280 simula- 
tions carried out for the sensitivity analysis (see Figure 6.3), these properties were 
made use of to allow for an efficient computation of the resulting ECGs. However, 
the simulations for the 2,000 candidate model instances were intentionally carried 
out using the computationally more expensive monodomain model to ensure that 
the mesh quality is sufficiently high for simulations with propagation models 
capable of producing re-entry simulations. In this way, the proposed calibrated 
model population is ready to use for the intended purposes comprising testing the 
efficacy of different channel blockers or ablation targets on arrhythmia termination 
in silico. 

Future work could also comprise separating the retained and calibrated mul- 
tiscale atrial models extended with fibrotic tissue variants into a healthy and AF 
cohort. In this way, an ensuing validation step regarding arrhythmia vulnerability 
of both subpopulations could be conducted by evaluating dominant frequency 
maps after inducing reentry. Another application of the proposed population could 
be to identify key model properties contributing to arrhythmia susceptibility in the 
cohort and evaluate whether either slow conduction velocity, ionic remodeling, 
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enlarged atrial volumes or a combination of them is the main driver for rotational 
activity. 


6.4.2 Related work 


Several previous studies have focused on generating, calibrating and validating 
populations of cellular models. Muszkiewicz et al. [237] showed that calibrated 
cell model populations yielded action potentials which biomarkers overlapped 
with those obtained from clinical experiments. They concluded that the popula- 
tion of ionic models are thus representative for a wide range of action potential 
morphology. Furthermore, they highlighted the need of a calibration step as cell 
models with ionic conductances set to their baseline values were not able to repro- 
duce action potential biomarkers within experimental ranges, in contrast to other 
instances of the model population even characterized by substantial parameter 
variations. Britton er al. [255] generated a large population of human ventricular 
computational cell models to identify key ionic properties leading to abnormal 
repolarization behavior. Sänchez er al. [198] calibrated populations of cell models 
under sinus rhythm and chronic AF conditions. They identified key ionic con- 
ductances underlying the inter-patient variability of action potential biomarkers 
in both populations aiming at a better understanding towards the mechanisms 
behind differences in arrhythmogenesis and response to treatment of both cohorts. 
Vagos et al. [256] followed a similar approach and found that the healthy and 
chronic AF populations exhibit substantial deviations in sensitivity of the ionic 
current parameters to pro-arrhythmic action potential biomarkers. In contrast 
to the generation and calibration of the multiscale model population presented 
in this chapter, a large number of ionic model instances had to be excluded in 
the calibration steps of the above mentioned studies. Britton et al. [54] reported 
that only 213 out of 10,000 initially generated cell model instances exhibited 
action potential biomarkers complying with experimental results when varying 


ionic conductances by +100 % of their baseline values. In the study presented 
in this chapter, a larger fraction of the candidate population (764 out of 2,000 
multiscale models) passed the LAT and ECG calibration steps. This relates to the 
fact that cell model populations have to be created assuming a large variation of 
ionic conductance parameters as these quantities cannot be measured in vivo or 
are affected by the isolation process during voltage clamp experiments [255]. In 
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contrast, CV can be directly derived from clinical LAT measurements and there- 
fore, more precise clinical intervals are available to draw the parameter values for 
the initial model population from. Even more straight forward is the generation 
of anatomical models. The SSM was built based on 47 patient-specific atrial 
models. Thus, selecting the PCA coefficients in the [-30, +30] interval leads to 
endocardial geometries within the eigenvector space of the SSM, theoretically 
exhibiting already realistic atrial geometrical variations as they also occur in the 
47 underlying SSM geometries. 

The latin hypercube sampling scheme was used to generate the model pa- 
rameters for the ionic and anatomical models as well as for generating locally 
heterogeneous CV variations. This approach is well established and has been used 
in various previous studies [54, 255]. However, more sophisticated approaches 
for generating in silico populations exhibiting coinciding biomarker distributions 
with clinical data have been published. Lawson et al. [257] proposed a method 
to iteratively tune model parameters such that the resulting computed biomarker 
distribution can be compared to a set of measured values in a step-wise approach. 
This method is computationally more efficient than brute force sampling and 
calibration until in silico and in vivo data distributions match. Even though this 
sampling technique will lead to a more representative population of models for 
actual clinical data, it will also imply an under-representation of rare and unusual 
action potential morphologies. However, the ability to investigate the mechanisms 
in outlier models is one of the main advantages of using simulated over clinical 
data where these morphologies do not occur in sufficient numbers to infer statisti- 
cal information, e.g. regarding drug response. For this reason, model parameters 
were sampled using the established latin hypercube approach for generating the 
multiscale population of models. 

Regarding tissue scale atrial electrophyiology, only a limited number of pre- 
vious studies have focused on the generation and validation of virtual cohorts 
covering various types of variability. In these, the generated populations usually 
either consisted of only a small amount of anatomical instances as they had to 
be built based on patient-specific segmentations or lacked a validation step. In 
the work from Corrado et al. [258], 10 atrial anatomical models were generated 
and validated against LATs acquired during an S1-S2 pacing protocol. Roney et 
al. [259] generated 100 patient-specific anatomical models of only the left atrium 
with the aim to predict long-term AF recurrence after catheter ablation for each 
individual patient. However, no explicit validation step of the computer mod- 
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els with clinically measured biosignals, such as LATs or ECGs, was presented. 
Rodero et al. [185] showed that only modifying anatomical properties in a four 
chamber human heart model was not sufficient to explain the entire range of vari- 
ability regarding electromechanical function occurring in clinical data. Therefore, 
the multiscale population of models presented in this chapter was generated by 
additionally incorporating functional variability on tissue and cellular scale. 


6.4.3 Limitations 


Sensitivity analysis on CV and anisotropy model parameters was performed based 
on simulations with the baseline Courtemanche cell model on the mean shape 
geometry. Other anatomical shapes of the atria, e.g., with bigger appendages, 
might have led to different results as sensitivity analysis might have identified 
different CV regions affecting the P wave. However, the CV variation in the 
four regions chosen based on the sensitvity results in this study is consistent with 
spatially heterogeneous conducting regions considered in other studies (compare 
chapter 9 and chapter 8). 

For the multiscale population of atrial models, only one torso geometry and no 
additional variation in terms of the atrial rotation angles were considered. When 
including thoracic shape variability, a variation of the first two eigenmodes of 
the SSM developed by Pishchulin et al. [179] would be sufficient to describe a 
large cohort. These two eigenmodes reflect in torso shape changes associated with 
height, weight and gender differences. The ensuing eigenmodes represent a change 
in body posture which are not particularly relevant for the simulation outcomes 
and do not represent pertinent variability in a large population. Furthermore, no 
fibrotic remodeling was included in selected tissue regions on the atria up until 
now which are indeed crucial to assess arrhythmia vulnerability and to test the 
efficacy of different channel blockers. 

The forward calculation was performed using the infinite volume conductor 
method which was shown to cause a prominent amplitude overestimation in 
lead V1-V3 (see chapter 4). Consequentially, these leads were not considered 
when calibrating the population based on the ECG simulation results. However, 
applying instead the boundary element method (see section 3.1.3.2) would allow 
for using all 12 leads as an additional source of evidence for accepting only model 
instances with morphology and biomarkers in the entire 12-lead ECG within 


121 


Chapter 6. Population of Multiscale Models 


clinical ranges. A threshold of 0.1 for the averaged rmse £ was applied to decide 
between accepted and rejected model instances during the P wave morphology 
calibration step. This threshold was chosen empirically based on exemplary 
reconstructed simulated signals in the clinical ECG eigenspace. In Figure 6.11, 
two examples for an accepted and rejected (€ = 0.08 and £ = 0.12) simulation 
run are shown. Choosing a different threshold for £ would clearly have led to a 
different number of instances retained for the final multiscale model population. 
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Figure 6.6: Calibration of multiscale model combinations of one exemplary geometry based 
on LATs on predefined atrial landmarks. 22 landmarks and the latest activation times in the 


left and right atrium are shown in each row. The clinically measured u 
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ranges are shown with the solid, dashed and dotted line, respectively. LATs simulated for 
accepted and rejected instances are visualized in black and red, respectively. RA = right atrium, 
LA = left atrium, LAT = local activation time 
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Figure 6.7: Statistics of calibrating all 2,000 multiscale model combinations based on LATs. 
The rows show the percentage of simulated LATs inside the clinical reference ranges (from 
top to bottom: u + o, u +20, u +30) for each landmark (yellow dots) represented as a region 
around it on the mean shape geometry. 
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Figure 6.8: Concatenated P waves of the 12-lead ECG. The mean of the clinical dataset is 
shown in red, the [-3, +3]o range of the first, second in third principle component from top 


to bottom row, respectively, is exemplary shaded in grey. 
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Figure 6.9: Lead-wise root mean squared errors (rmse) between simulated and approximated 
signals in a reduced eigenspace derived from PCA on clinical ECGs. The thin solid line shows 
the threshold of 0.1 applied for excluding model instances exhibiting ECG characteristics not 
complying with typical clinical ECG morphology. 


125 


Chapter 6. Population of Multiscale Models 


12-lead ECGs for 20 instances belonging to anatomical model atriaID_070 
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Figure 6.10: 12-lead ECGs calculated on the anatomical model instance atrialD_070 com- 
bined with 20 different CV and ionic model combinations. ECGs that were rejected for the 
final population based on LATs, P wave duration and ECG morphology are visualized in blue, 
cyan and red, respectively. ECGs of multiscale models passing all calibration steps are shown 
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Figure 6.11: Two examples of simulated signals (blue lines) and their least squares recon- 
struction with in the clinical ECG eigenspace. The example in the top panel shows an ECG of 
a model instance that was accepted in the population as € < 0.1, the example in the bottom 
panel was rejected. 
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Simulation of q Large-scale ECG 
Dataset of Healthy Subjects and 
Selected Pathologies 


In this chapter, the simulation of 10,000 12-lead electrocardiogram (ECG) sig- 
nals of 10 seconds length for healthy subjects and patients with selected cardiac 
pathologies is presented. These are of a comparable format to clinically measured 
ECGs and could therefore be directly employed as an extension to clinically 
recorded signals for deep learning applications. 


This study was carried out within the scope of a project supported by the EM- 
PIR programme co-financed by the participating states and from the European 
Union’s Horizon 2020 research and innovation programme under grant Medal- 
Care 18HLT07. Methodology and results were developed, generated and discussed 
in close collaboration with Karli Gillette, Matthias Gsell and Gernot Plank from 
Medical University of Graz, Graz, Austria. Karli Gillette simulated and provided 
the ORST segments for the healthy cohort and all ventricular pathologies. 
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7.1 Introduction 


Unless a four chamber heart model constitutes the underlying geometry for elec- 
trophysiological simulations, basic simulation protocols applied for example 
within the openCARP [98] simulation framework allow for generating single beat 
P waves or QRS(T) complexes and lack the possibility to represent intra-patient 
variability in the ECG over time. This impedes the direct and straight forward 
usage of simulated ECGs as an extension to clinically measured datasets in the 
machine learning context since the simulated single beat ECG waveforms are not 
of acomparable format to clinically recorded ECG time series of at least multiple 
seconds. 

Thus, a synthesization pipeline was developed to stitch P waves and QRST seg- 
ments simulated independently of one another together. The synthesized heartbeat 
was then extended to a 10s ECG time series while accounting for heart rate vari- 
ability within a patient or subject. In this way, a dataset of 10,000 simulated ECGs 
was generated with samples equally distributed among a healthy control group 
and patients with each of the following cardiac pathologies: interatrial conduction 
block (IAB), left atrial enlargement (LAE), arrhythmogenic fibrotic atrial car- 
diomyopathy (FAM), Ist degree atrioventricular block (AVB), right bundle branch 
block (RBBB), left bundle branch block (LBBB) and myocardial infarction (MI). 


7.2 Methods 


7.2.1 Simulation Protocol and Parameter Settings 


The simulation parameters for P wave simulations of the healthy cohort and the 
atrial pathologies (FAM, LAE and IAB) are explained in detail in chapter 8, 
chapter 9 and by Bender et al. [260]. For the ventricular pathologies (LBBB, 
RBBB, MI) and AVB, P waves of the healthy sinus rhythm control group were 
used in the subsequent synthesization process (see section 7.2.2). Parameter 
ranges and modeling methodology for the ventricles are described by Gillette et 
al. [34, 261]. For the healthy control group and each of the pathologies listed 
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above, 1,300 synthetic ECGs were simulated and synthesized, summing up to a 
total of 10,400 in silico ECGs. 

80 geometrical models of the atria underlain the simulation of the healthy 
control group, as well as the FAM and IAB cohorts. They were generated by 
drawing the eigenvector coefficients of the bi-atrial statistical shape model (SSM) 
(see chapter 5) from a Gaussian distribution in the [-30, +30] interval (see 
chapter 8). For the LAE cohort, 45 additional anatomical models were generated 
by optimizing the SSM’s eigenvector coefficients such that the left atrial volumes 
were uniformly distributed between 48 and 65 ml which is explained in greater 
detail in chapter 9. Conduction velocity was assigned to different atrial regions 
and varied within physiological intervals for all pathology classes and the control 
group. Fibrosis was modeled as described in chapter 8, key parameters considered 
in fibrotic remodeling scenarios are summarized in section 8.2.1.2. The atria 
were placed in a torso volume conductor for which geometrical models were 
also derived from a human body SSM [179]. The reference position of the atria 
inside the thorax was defined based on the mean location and orientation of eight 
patient-specific atrial models in their respective torso geometry [141]. The rotation 
angle and translation parameters of the heart within the torso around and along all 
three coordinate axes were varied relative to the reference position in ranges of 
[-15°,+15°] and [15 mm, +15 mm], respectively. The Eikonal model was solved 
with the Fast Iterative (for LAE, FAM and the healthy control group) and the Fast 
Marching method (for IAB) to calculate local activation times (LATs) and based 
thereon, the transmembrane voltage distribution with a Courtemanche et al. [91] 
action potential template as described in section 3.1.2.3. The forward problem 
was solved with the boundary element method (BEM) (for LAE and IAB) and 
the infinite volume conductor method (for FAM and the healthy control group) to 
obtain P waves of the 12-lead ECG. Further details on the simulation protocol and 
parameter settings for the FAM and healthy cohort, the LAE cohort and the IAB 
cohort are elaborated on in chapter 8, chapter 9 and in the study from Bender et 
al. [260]. 


7.2.2 Synthesization of P waves and QRST segments 


The pipeline for synthesizing single beat P waves and QRST segments from 
individual simulation runs is outlined in Figure 7.1. ECG segments of atrial and 
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Figure 7.1: Synthesization pipeline for stitching P waves and QRST segments from separate 
simulations runs together and generate a 10 s ECG time segment. Multivariate normal distri- 
butions (MVDs) were set up using ECG timing and amplitude features extracted from clinical 
signals. In this way, R peak amplitudes were first scaled to match the simulated P wave am- 
plitude, a realistic PQ segment was chosen complying with the simulated P wave duration 
and a mean heart rate was defined based on the QT interval of the simulated QRST segment. 
An RR time series was generated based on a heart rate variability model to concatenate the 
resulting single heartbeat to a 10s signal trace which can optionally be superimposed with 
realistic ECG noise and band-pass filtered using the recommended cut-off frequency settings. 


ventricular activity were generated in separate simulation runs independently from 
one another. Thus, it was not explicitly ensured that heart and torso geometries for 
the resulting atrial and ventricular signals are compatible. In particular, P wave 
and R peak amplitudes were likely to mismatch since also different forward 
calculation methods were employed for atrial and ventricular simulations. Thus, 
an amplitude scaling step marked the beginning of the synthesization process. For 
that purpose, a multivariate normal distribution (MVD) was set up for P wave 
amplitude and R peak amplitude extracted from 918 ECGs of healthy subjects 
in the Physionet Computing in Cardiology 2020 Challenge Database [262] as 
described previously [263]. The resulting distribution is visualized in Figure 7.2. 

From the P wave simulation result to be synthesized with a QRST segment, 
the P wave amplitude in lead II was calculated and in this way, a realistic scaling 
factor for multiplying the respective simulated QRST segment was drawn from 
the MVD. Likewise, an MVD for P wave duration and PQ interval extracted using 
ECGdeli [250] from the same clinical database [262] was constructed to sample a 
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Figure 7.2: Multivariate normal distribution for R peak and P wave amplitudes. The left 
panel shows the probability of specific combinations of R peak and P wave amplitudes in 
lead Il. The right panel represents the same information as a 2D projection with probability 
densities color coded in the same way as in the figure on the left hand side. 


time interval for the PQ segment matching the simulated P wave duration. For 
the AVB pathology class, the PQ segment was constrained to be sampled so that 
the PR interval quantified to at least 200 ms. The scaled QRST segment could be 
stitched together with the simulated P wave by inserting a sigmoid function of a 
length determined by the PQ segment in between. This single simulated heartbeat 
was then further processed and extended to a 10 s ECG time series as explained in 
section 7.2.3. 


7.2.3 Realizing Beat-to-beat Variability Through a Heart 
Rate Variability Model 


Building on an MVD of QT intervals and mean RR intervals, the mean heart 
rate of the simulated and synthesized ECG signal was determined based on 
the QT interval extracted from the ventricular simulation outcome. This mean 
RR interval was used to generate a time series of RR intervals with the heart rate 
variability model proposed by Kantelhardt et al. [264]. Their model considers 
transient correlation of successive heartbeats and stochastic variation underlying 
the regulation and change in heart rate over time. By scaling the QRST segment 
so that the QT interval matches the single RR interval for every heartbeat and 
adding a sigmoid TP segment to fill up the remaining ECG samples in between 
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two beats, the 10 s time series of the ECG was constructed. This was subsequently 
superimposed with realistic ECG noise [265] and filtered with a low- and high-pass 
filter with their cut-off frequencies set to 150 Hz and 0.5 Hz, respectively. 


7.2.4 Assessing Fidelity of Simulated Signals 


6 timing features (P wave duration, PR interval, QRS duration, QT interval, 
T wave duration and RR interval) and 5 amplitude features in each lead (P wave 
amplitude, Q-, R- and S peak amplitudes, T wave amplitudes) were extracted from 
the resulting simulated database comprising 1,300 signals per pathology class 
or healthy control group. The same features were extracted from the PTB XL 
database [16]. This dataset has neither been used before in any calibration step 
preceding the P wave and QRST simulations in this study nor in the generation of 
the MVDs and could thus serve as an independent validation resource to assess 
fidelity of the simulated signals by comparing the resulting feature distributions. 


7.3 Results 


Example ECGs of the in silico cohorts for the healthy control group (red curves) 
and the atrial pathologies (blue curves) are shown in Figure 7.3. Examples for 
ventricular pathological signals are to be found in appendix B in Figure B.1. Simu- 
lated signals exhibit P wave alterations compared to the healthy case characteristic 
for the specific disease: In LAE signals, the overall P wave duration is slightly 
prolonged and also P wave terminal force, calculated in clinical practice as the 
product of interval and amplitude of the negative deflection in lead V1, is increased 
compared to the healthy control group. Also in IAB signals, P wave duration is 
increased. Additionally, a change in morphology, polarity and amplitude is visible 
especially in lead III and aVL. Signal variations in FAM ECGs are more subtle 
and comprise a marginal increase of P wave duration and a small decrease in 
P wave amplitude, mostly visible in the anterolateral leads V5 and V6. 

The feature distributions for all timing features and the amplitude features 
in lead II for the simulated and clinical healthy control group are depicted in 
Figure 7.4. It gets apparent from Figure 7.4, that the features extracted from the 
simulated cohort lie within the ranges of those extracted from clinical signals. 
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Figure 7.3: Example 12-lead P waves for the atrial pathology classes. Red and blue curves 
represent simulated signals of the healthy control and the LAE, IAB and FAM pathologies from 
top to bottom row. 


Large deviations, however, are visible in the T wave amplitude that ranged until 
1 mV in the simulated cohort, whereas the largest values in the clinical signals did 
not exceed 0.5 mV. 

In Figure 7.5, the distributions of characteristic features used for a diagnosis 
of each atrial disease are compared to the respective distribution of the healthy 
control group. It is evident, that the trend in feature shifts between healthy control 
and the pathological cases is consistent between the simulated and clinical LAE 
cohorts. In the IAB cohort, amplitudes in the lateral limb leads tend to be smaller 
compared to the respective quantities in the healthy control group. Characteristic 
features for the FAM class comprise P wave amplitude in the anterolateral leads 
that statistically exhibit smaller values than the ones extractable from the healthy 
cohort. However, for FAM and IAB, no clinical signals were available in PTB XL. 
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Figure 7.4: Feature distributions for the healthy control group. Red and blue curves repre- 
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A comparison with measured P wave features from another clinical data source 
is shown in detail for the FAM class in section 8.3.2. A quantitative validation 
of IAB P wave timing, amplitude and morphology features was carried out by 
Bender et al. [260] (see section 7.4). Feature distributions for the AVB class and 
the ventricular pathologies are shown in appendix B in Figure B.2. 
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Figure 7.5: Feature distributions for the atrial pathologies compared to the healthy control 
group. Red and blue curves represent the probability density of the extracted feature values 


from the simulated and clinical ECGs, respectively. Dashed and solid lines show the distribu- 


tions of the pathologies (from left to right: LAE, IAB, FAM) and healthy control, respectively. 
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7.4 Discussion 


In the study presented in this chapter, a large scale dataset of 10,000 simulated 
12-lead ECGs each of 10s length was compiled and validated against clinical 
signals. The synthetic database will be published and made openly available to 
the community together with a complete list of parameters underlying the gen- 
eration of each signal. Besides a healthy control group, pathologies covered in 
this in silico dataset comprise FAM, LAE and IAB as examples for atrial diseases, 
AVB as an example for atrioventricular conduction disorders, as well as LBBB, 
RBBB and MI as examples for ventricular pathologies. These pathologies reflect 
in altered ECG signal characteristics of different degrees. Thus, the generated 
ECGs are particularly suitable as a benchmark dataset to develop machine learning 
algorithms for automated signal analysis and arrhythmia classification, even for 
difficult classification tasks requiring the detection of subtle signal variations. 
The simulated dataset fulfills important properties essential for developing such 
ECG-based computer assisted diagnostic tools: It contains a large number of 
signals in contrast to most data sources of clinically recorded ECGs [262], the 
number of samples per pathology class is equally distributed and the ground truth 
of the underlying pathology did not need to be annotated by cardiologists, but 
was instead defined by parameter changes in the underlying simulation run and 
are thus precisely known. As such, the presented dataset can be employed for 
transfer learning tasks, i.e., to pre-train a machine learning classifier on simulated 
data and fine-tuning it using the actual clinical ECGs. In this way, the learning 
process can be considerably expedited and streamlined, as the network can build 
on prior knowledge and experience instead of learning dependencies in the input 
data from scratch. However, an important prerequisite for successful pre-training 
of a network is a small domain gap between the dataset for initializing the net- 
work’s weights and the actual data the network’s performance should be optimized 
for. Therefore, an explicit validation step ensuring that the in silico ECGs are 
representative and realistic compared to clinically measured signals is indispens- 
able. Moreover, the presented database allows for full traceability of altered ECG 
characteristics with regard to underlying cellular, anatomical and tissue properties, 
as parameter sets defined for each simulation run are are available for each ECG 
signal. 
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In the study presented in this chapter, the validation of the simulated ECGs was 
performed by annotating fiducial points in the 12-lead signal, delineating single 
waveforms and extracting common timing and amplitude features therefrom. The 
comparison with clinical ECG features extracted in the same way showed that 
simulated ECG features lied mostly within the range of the clinical intervals 
in the healthy case. However, simulated T waves showed too high amplitudes 
compared to clinical feature statistics, most likely having occurred due to too high 
action potential duration gradients in apico-basal direction mapped onto the Tejose 
parameter in the Mitchell-Schaeffer ionic model [266]. The comparison of feature 
statistics also shows that the simulated data do not cover the full range of the 
feature variation in the clinical case. Especially RR intervals exhibited a much 
more narrow distribution around a mean of 750 ms compared to clinical features. 
However, the mean RR interval in the in silico dataset was defined using an MVD 
based on the simulated QT interval, for which the feature distribution was also 
smaller and shifted to smaller time values compared to the clinical case. It can thus 
be concluded that the simulated feature statistics are consistent in themselves and 
do not exhibit unrealistic feature combinations within the same signal. Simulated 
signals not covering the entire range of clinically measurable feature values might 
have occurred due to insufficient simulation parameter variation not exploiting the 
full physiological ranges. 

In the pathological cases, the shift and change of feature statistics compared 
to the healthy control case was compared among simulated and clinical data. For 
all pathologies where corresponding data was available in the clinical PTB XL 
dataset, the changes in feature statistics were consistent between in silico and 
clinically measured ECGs (compare Figure 7.5 and Figure B.2). PTB XL lacked 
the respective signals for the FAM and the IAB pathology class, inhibiting a 
similar approach for the ECGs in these classes. However, a detailed comparison 
of simulated and clinical P wave features for the FAM class was carried out using 
a different clinical datset in chapter 8. The change in P wave amplitude in V6 as 
shown in Figure 7.5 (left panel) can be attributed to the chosen fibrosis remodeling 
methodology, as 50 % of the elements in fibrotic patches were modeled as passive 
conduction barriers not contributing to the overall source distribution in the heart 
and thus causing a decreased P wave amplitude on the body surface. Bender et 
al. [260] presented the generation and validation of simulated P waves for the 
IAB pathology class and discussed the resulting feature variations in detail. They 
found that P wave amplitude variation in lead aVL as visible in Figure 7.5 (middle 
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panel) arose specifically due to conduction blockage in Bachmann’s bundle and 
the accompanying retrograde activation of the left atrium. 

The 10s time series of synthetic ECGs were generated by concatenating 
P waves and QRST segments simulated independently from one another. Different 
methods for solving the forward problem of electrocardiography were employed 
for the atrial and ventricular simulations and moreover, different torso volume 
conductors were used in the simulation runs of atrial and ventricular signals which 
were later concatenated to the ECG time series. Even though an amplitude scaling 
step was introduced to prevent the generation of an ECG heartbeat exhibiting 
non-matching amplitudes of P waves and QRS complexes, it is not ensured that 
the chosen atrial and ventricular anatomical and electrophysiological parameters 
underlying the generation of one specific ECG signal are compatible. 

Anzlinger [267] presented a different approach for synthesizing single beat 
P waves and QRST segments to a time series of 10s ECGs. They extracted intra- 
patient variability in various ECG features from clinical data and sampled scaling 
factors for amplitude and time intervals to vary single P waves and QRST segments 
to be concatenated to the resulting time series signal. However, this approach 
lacks the possibility of representing correlating RR intervals of subsequent beats. 
Furthermore, mutually dependent ECG features, e.g. QT- and RR intervals, were 
not accounted for, potentially leading to inconsistent ECG feature distributions 
within the same patient or subject. Therefore, the heart rate variability model 
developed by Kantelhardt [264] was implemented for this study to generate 
realistic RR interval time series and MVD feature distributions were generated 
based on an independent clinical ECG dataset to take dependencies of intra-patient 
feature variations into account. 
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NON-INVASIVE RISK 
STRATIFICATION OF ATRIAL 
FIBRILLATION THROUGH 
MACHINE LEARNING AND 
IN SILICO ECG DATA 


Chapter 8 


Estimation of Left Atrial Fibrosis 


In this chapter, the capability of neural networks applied to simulated and clin- 
ical P waves to estimate the volume fraction of fibrotic substrate in the atria is 
examined. In this way, the potential of the 12-lead electrocardiogram (ECG) as a 
non-invasive and cost-effective tool for the early detection of fibrotic atrial car- 
diomyopathy can be gauged to ultimately stratify atrial fibrillation (AF) propensity. 


The content of this chapter is taken from a paper and two conference proceedings 
that have all been published open access under licence CC-BY 4.0 in Journal of 
Clinical Medicine [122], Current Directions in Biomedical Engineering [268], 
and Computing in Cardiology [269]. Most passages in this chapter have been 
quoted verbatim from the publications. 


8.1 Introduction 


The clinical picture of AF is characterized by disorganized reentrant waves travers- 
ing the atrial tissue and causing an irregular heart rhythm (see section 2.3). The 
maintenance of the arrhythmia requires either sustained rapid ectopic foci firing 
or a single ectopic focus acting as an AF driver in regions of vulnerable atrial 
substrate. Hence, fibrotic tissue having undergone structural and electrical remod- 
eling processes provides the necessary substrate to contribute to the perpetuation 
of AF [270, 271]. Disease mechanisms contributing to fibrotic remodeling of 
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atrial tissue include an increased deposition of collagen strands and other extracel- 
lular matrix proteins in the interstitial space. The accumulation of collagenous 
septa implies the separation and electrical decoupling of myocytes and thus re- 
stricts the electrical depolarization wave to propagate via alternate conduction 
pathways. With increased conduction anisotropy, slowed conduction due to down- 
regulated gap Junction proteins, and the formation of unidirectional conduction 
blocks, fibrotic patches provide an arrhythmogenic substrate for the initiation and 
maintenance of functional and anatomical re-entry patterns. 

Quantifying the amount of these arrhythmogenic substrate areas could thus 
be an important means for individual risk stratification of new-onset AF and 
arrhythmogenic fibrotic atrial cardiomyopathy (FAM) [272, 273]. Moreover, 
it could serve as a basis for suggesting susceptible patients for AF screening, 
choosing an appropriate treatment strategy, and reducing the risk of accompanying 
co-morbidities [81]. 

Catheter ablation is a common treatment option in clinical practice for the 
purpose of blocking certain conduction pathways in the atria that are suspected to 
contribute to the onset and maintenance of the arrhythmia. However, the optimal 
choice of these ablation targets beyond pulmonary vein isolation is challenging 
and calls for a personalized approach. AF recurrence rates between 20 and 60% 
in patients with large arrhythmogenic substrate areas in the atria [274] underline 
the need to tailor therapy to the substrate [275-278]. Accordingly, scarring 
of additional targets in those patients was suggested to improve the ablation 
outcome [31, 279-281]. Quantifying the amount of atrial fibrosis non-invasively 
using the ECG could thus contribute to select patients suitable for ablation and 
contribute to a more effective treatment of AF [279]. 

High intensity areas on late Gadolinium enhancement magnetic resonance 
imagings (LGE-MRIs) as well as low bi- and unipolar electrogram voltages [80, 
216, 282] recorded during electrophysiological studies are currently drawn on to 
identify the amount and spatial location of scars and fibrosis on the atrial endocar- 
dial wall. However, LGE-MRI is a cost-intensive imaging technique for which 
technical parameters have to be carefully selected. Additionally, segmentation of 
magnetic resonance (MR) images is cumbersome and challenging [201]. Further- 
more, electroanatomical mapping is an invasive and time-consuming procedure 
with inter-electrode spacing and electrode sizes being additional confounding 
factors influencing electrogram voltages [283]. 
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To circumvent these shortcomings, the 12-lead ECG as a commonly available 
and easily usable tool in clinical practice could feature a way to quantify the 
fibrotic left atrial volume fraction and in consequence predict the propensity to 
future AF incidents [284]. The duration of the P wave in the 12-lead ECG has 
been shown to correlate with the amount of left atrial (LA) low-voltage substrate 
in AF patients if a low threshold for determining P wave on- and offsets are 
chosen [273, 285-287]. P wave terminal force in V1 (PTF V1) has been identified 
as sensitive to a change in conduction velocity in the atria [176], which is typically 
caused by fibrotic substrate in AF patients. Also P wave dispersion, i.e., the 
difference between latest and earliest detectable P wave offset across all 12 ECG 
leads, has been shown to be a measure for locally heterogeneous conducting 
tissue regions [272]. Other ECG-based predictors for the onset of AF and for the 
presence of low-voltage areas include P wave amplitude [288, 289], P wave area 
in V3 [290] and root mean squared (rms) voltage in the terminal P wave signal 
parts [291]. 

However, these P wave-derived features are not only affected by fibrotic infil- 
tration of atrial tissue. P wave duration (PWd) and dispersion are also influenced 
by changes in conduction velocity and atrial anatomy [169] that both vary between 
different subjects within healthy reference ranges [215, 292, 293]. Furthermore, 
PTF V1 is highly dependent on the placement of the electrodes on the thorax [294]. 
Lastly, different thoracic geometries yield different P wave amplitudes, which 
might additionally be impaired by loose electrode contact. 

Therefore, the aim of the study presented in this chapter was to quantify 
which changes in P wave morphology, amplitude and duration can be attributed 
specifically to fibrotic infiltration of atrial tissue. Furthermore, it was investigated 
if these effects on the P wave can be separated from confounding changes induced 
by healthy anatomical variation of the atria and the thorax as well as different 
electrode positions. For that purpose, various anatomical model combinations 
were composed constituting of different geometries for atria and torsos as well as 
different orientation angles of the atria within the torso. Electrophysiological sim- 
ulations with inclusion of fibrotic regions in the atria of different volume fraction 
were conducted in sinus rhythm wherefore a set of different baseline conduction 
velocitys (CVs) in healthy reference ranges were imposed to also account for 
functional variation within the virtual cohort. Subsequently, changes in P wave 
features caused by geometry and rotation angle variations (section 8.3.1) were 
compared to those caused by local atrial substrate modifications (section 8.3.2). 
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The results thereof were employed to estimate the fibrotic LA volume fraction 
from simulated P wave features with neural networks in section 8.2.4. 

For a successful clinical translation of the proposed ECG-based fibrosis es- 
timation method, the network needs to fulfill the requirement of low sensitivity 
to inaccurately extracted feature values which is addressed in section 8.3.3.4. 
This is because the proposed regression model would rely on features extracted 
from simulated P waves of the 12-lead ECG. Those features might be easily and 
robustly extractable from noise-free simulated signals, but their values are subject 
to several disturbances in the clinical use case. It was therefore quantified how sen- 
sitive the network would be towards inaccurately determined feature values and to 
what extent the network’s estimation of the fibrotic volume fraction would still be 
reliable if the feature values were superimposed with noise (see section 8.3.3.4). 

Finally, to investigate the expediency and applicability of the proposed meth- 
ods in a realistic clinical use case, the developed network is applied to measured 
clinical ECGs (see section 8.2.2). 


8.2 Methods 


8.2.1 In silico Database 


8.2.1.1 Virtual Patient Cohort 


In order to generate a large-scale dataset of P waves representing a virtual patient 
cohort characterized by different anatomical properties, various atrial and thoracic 
geometries were derived from statistical shape models (SSMs). The bi-atrial 
SSM described in detail in chapter 5 [169] was used to generate 80 random 
volumetric instances of the atria augmented with homogeneous wall thickness, 
rule-based fiber orientation [213], tags for anatomical structures and inter-atrial 
connections [200]. An exemplary instance is shown in Figure 5.5. 

Furthermore, 25 thoracic geometries were generated by varying the two lead- 
ing eigenvectors of the model developed by Pishchulin er al. [179] systematically 
in the [—2,2]o range. The first two eigenvectors account for approximately 80% 
of the total shape variance. The first two eigenmodes, i.e., the shape variability 
resulting from a variation of only the respective eigenvector, are visualized in 
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Figure 8.1. A variation of both of them reflects in a change of the torso size in 
superior-inferior direction and in anterior-posterior direction in both the chest and 
the abdominal regions. In this way, height, weight and gender variation could be 
accounted for in the virtual cohort. 
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Figure 8.1: Representation of eigenmodes of the upper body statistical shape model. The 
first eigenmode (top row) reflects in a change of the torso size predominantly in the abdominal 
region, the second one (bottom row) in the chest region. 


Each of the 80 atrial geometries were rotated by -20°, 0° and +20° around 
the x-, y- and z-axes [295] leading to 27 permutations of different orientation 
angles for each individual atrial geometry. Combinations of every rotated atrial 
geometry placed in each of the 25 thoracic geometries were realized and thus a set 
of 54,000 anatomical models were obtained. The different model and parameter 
combinations are illustrated in Figure 8.2. For each of these combinations, electro- 
physiological simulations were conducted in sinus rhythm and the 12-lead ECG 
was extracted at the standardized electrode positions. The sinus rhythm simula- 
tions were repeated with electrical and structural remodeling of different degrees 
for all model combinations. For that purpose, the ionic and tissue parameters were 
modified as described in section 8.2.1.2 in selected tissue patches covering 0%, 
5%, 10%, 15%, 20%, 25%, 30%, 35%, 40% and 45% of the total LA myocardial 
volume. In this way, the virtual cohort comprised a total of 540,000 (80 atria 
geometries x 25 torso geometries x 27 rotation angles x 10 volume fractions 
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Figure 8.2: Representation of the different model combinations. A total of 642,400 P waves 


were simulated for a virtual patient cohort characterized by anatomical and functional vari- 
ability as well as fibrosis covering different volume fractions of the atrial tissue. 


covered by fibrosis) anatomical model combinations of healthy subjects and FAM 
patients. These were subject to the electrophyiological simulations described 
in section 8.2.1.3 to finally obtain 642,400 12-lead ECG signals evaluated and 
processed for the analysis described in sections 8.2.3 and 8.2.4. 


8.2.1.2 Modeling Methodology of Fibrotic Tissue 
For each atrial model, variants with fibrosis covering 0%, 5%, 10%, 15%, 20%, 
25%, 30%, 35%, 40% and 45% of the total LA myocardium volume were created. 


Subsequently, the volume fraction of right atrial fibrosis was defined for each case 
according to the findings from Akoum et al. [296] (Table 8.1). 
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Table 8.1: Volume fraction of fibrosis in the right atrium for each Utah stage as reported by 
Akoum et al. [296]. 


Utah Stage | Fibrotic LA volume fraction | Fibrotic RA volume fraction 


Utah | 0-5% 1.27 + 0.38% 
Utah II 5 - 20% 4.65 + 0.70% 
Utah III 20 - 35% 9.40 + 2.16% 
Utah IV >35% 12.66 + 3.0% 


Considering the patchiness of fibrosis observed in AF patients [297], several 
disconnected patches on the atrial surface were defined as fibrotic accumulating 
to the total left atrium (LA) volume fraction of fibrosis. Each of these individual 
fibrotic patches was defined by a center seed point and a radius around it. The 
total number of seed points and the sizes of the radii were chosen depending on 
the total volume fraction of fibrosis to be covered. The positions of the seed points 
for the patchy fibrotic regions were defined by taking the spatial distribution of 
fibrotic atrial substrate as reported by Higuchi er al. [145] for the left and Akoum 
et al. [296] for the right atrium into account. Radii randomly chosen in a range of 
[3, 6] mm around these seed points determined the candidate regions of fibrosis 
in the atria. To not only account for the patchy nature of fibrotic tissue, but also 
for its diffuse appearance, around 80% of the cells within the candidate regions 
defined above were assigned to the fibrotic substrate (compare Figure 8.3 (middle 
and left column)). In these substrate regions, the simulation parameters were 
adjusted as described in the following to account for structural and electrical 
fibrotic remodeling. 

Fibrosis infiltrating the regular myocyte tissue structure cause adjacent my- 
ocytes to be electrically decoupled and thus act as passive barriers to the prop- 
agating wavefront. The concept of percolation [128] was drawn on to account 
for this phenomenon in the simulations. 50% of the cells within the fibrotic 
regions were therefore defined as belonging to the extracellular matrix. Hence, 
these cells impair the normally straight conduction across the tissue and constrain 
the intracellular depolarization wave to pass around the fibrotic barriers [81]. In 
the remaining 50% of the cells belonging to the fibrotic regions, maximal ionic 
conductances were rescaled as suggested by Roney et al. [82] to account for 
cytokine-induced remodeling (50% gx,, 60% gna, 50% gcar) [48]. Furthermore, 
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d Definition of seed N (— Allocation of fibrotic N Repetition of the N 


points and candidate elements to 80% of the procedure in remaining 
fibrotic patches in each cells in the candidate atrial regions 
atrial region regions 


Figure 8.3: Definition of the spatial distribution of fibrotic tissue. The atrial geometry was 
first separated into 6 subregions for the left and 2 subregions for the right atrium as reported 
by Akoum et al. [296] and Higuchi et al. [145] indicated by the black separation lines. The 
stage of fibrosis to be modeled was then set (15% in this example) and the number of seed 
points and radii around them were chosen pseudo-randomly by ensuring that within each 
of these subregions, the total volume of fibrotic elements accumulated to the spatial fibrosis 
distribution found in clinical studies [145, 296]. 80% of the cells in these candidate regions 
were defined as fibrotic (middle column) and this process was repeated for all subregions (right 
column). 


conduction velocities were reduced by 80% in transversal and 50% in longitudinal 
direction, which in turn caused anisotropy ratios to be increased by a factor of 2.5. 
In this way, local CV heterogeneities and anisotropic wavefront propagation was 
accounted for facilitating functional reentry in AF patients [298]. 


8.2.1.3 Electrophysiological Simulations 


For each atrial model and volume fraction covered by fibrosis, simulations were 
performed by solving the anisotropic Eikonal equation with the fast iterative 
method [158]. For all simulations, excitation was initiated from a sinus node 
trigger located at the junction of the superior caval vein and the right atrial 
appendage [149]. The atrial wall was separated into five different anatomical 
regions: bulk right and left atrium, inter-atrial connections, pectinate muscles, 
crista terminalis and inferior isthmus. For each of these regions, the anisotropy 
ratio and baseline (B) conduction velocity in transversal (L) fiber direction CV, p 
were chosen as reported previously [149] (Table 8.2). For simulations involving 
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fibrotic tissue areas, two additional anatomical regions were included as described 
in section 8.2.1.2: Non-conductive elements were characterized by a conduction 
velocity of CV, = Omm/s and CV was reduced by 80% in slow conducting 
cells. Anisotropy ratio in slow conducting tissue was increased by a factor of 2.5 
compared to baseline. 


Table 8.2: Conduction velocities in transversal fiber direction and anisotropy ratios. 


Tissue region CV ginm/s | Anisotropy ratio 
Bulk right and left atrium 0.591 2.090 
Crista terminalis 0.591 2.843 
Pectinate muscles 0.461 3.780 
Inter-atrial connections 0.645 3.339 
Inferior isthmus 0.540 1 
Fibrosis (non-conductive) 0 NA 
Fibrosis (slow conducting) 0.2. CV 2.5. AR 


To additionally account for functional variability within the virtual cohort, 1-3 
different CV values in the interval [-20 %, +20 %] - CV | p were sampled in each 
region assuming a uniform distribution for each of the 540,000 anatomical model 
combinations described in section 8.2.1.1. 

By solving the Eikonal equation, the spread of electrical activation in sinus 
rhythm was computed and local activation times (LATs) were obtained at each 
node. By shifting a Courtemanche et al. [91] action potential template in time 
according to the calculated LATs as proposed by Kahlmann et al. [299], the 
transmembrane voltage distribution on the atria was obtained. A remodeled 
Courtemanche action potential as described in section 8.2.1.2 was used for cells 
in slow conducting fibrotic areas representing cytokine-induced remodeling. 

For each model combination explained in section 8.2.1, the atria were as- 
sumed to be embedded in an infinite volume conductor of conductivity o, = 
0.2 S/m [139]. The extracellular potentials were extracted at the respective elec- 
trode positions [300] and used to derive the P wave of the standard 12-lead ECG. 
For analyzing the influence of the V1 electrode position on PTF V1, the 12-lead 
ECG was also extracted from the electrophysiological simulations with the refer- 
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ence atria and torso geometry for varying positions of the V1 electrode. The latter 
was varied within a radius of 6 cm around the standard V1 electrode position. 

The raw ECGs simulated in this way were subject to feature extraction (see 
section 8.2.3) to systematically evaluate the influence of healthy anatomical, func- 
tional and pathological variation on ECG features (see sections 8.3.1 and 8.3.2) 
and assess whether these features carry diagnostically relevant information to esti- 
mate the LA volume fraction of fibrosis with neural networks (see section 8.3.3). 
To systematically investigate the influence of inaccurately extracted features, Gaus- 
sian noise was added directly to the extracted feature values as described in detail 
in section 8.3.3.4. For the translational study described in section 8.2.2, realistic 
ECG noise as generated by Petrenas et al. [265] was added to the simulated signals 
before applying high- and lowpass filters with cut-off frequencies of 0.5 Hz and 
150 Hz as recommended by Kligfield er al. [301], respectively. 


8.2.2 Clinical Database 


From 27 AF patients who underwent an electroanatomical mapping procedure at 
Städtisches Klinikum Karlsruhe and University Hospital Essen, bipolar electro- 
grams as well as 12-lead ECGs in sensed paced rhythm were recorded (see Fig- 
ure 8.4, left panel). The patients’ age ranged between 49 and 84 years (67.19+8.64 
years) and 11 out of the 27 patients were women. Patients provided informed con- 
sent and the study was approved by the institutional review boards. Detecting the 


activity at the pacing site from the electrograms allowed to select time windows 
of normal sinus rhythm activity in the ECG traces (green highlighted intervals in 
Figure 8.4). ECGdeli [250] was applied to automatically delineate the P waves in 
the 12-lead ECGs in sinus rhythm. Furthermore, the fraction of fibrotic substrate 
in the left atrial endocardium was calculated for each patient by identifying the 
areas where bipolar peak-to-peak voltage in the intracardiac electrograms was 
below 0.5 mV. P waves were also delineated using ECGdeli in the 12-lead sinus 
rhythm ECGs of 7,185 healthy subjects from the public PTB-XL dataset [16] (see 
Figure 8.4, right panel). 

In this way, a total of 68,282 single clinical P waves from 7,185 subjects in 
the control group and 42,227 single P waves from 27 patients with an extent of 
fibrotic left atrial areas between 7.05 % and 77.28 % were used as clinical input 
for the machine learning classifiers. 
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Figure 8.4: Clinical ECG data from 27 AF patients and 7,185 healthy individuals applied for 
clinical translation of the fibrosis estimation methods. Time periods of normal sinus rhythm 
activity were extracted from the ECG recordings in sensed paced rhythm (green intervals in 
the left panel) and ground truth fibrotic volume fraction was calculated as the surface area 
fraction where bipolar voltage was below a threshold of 0.5mV. The delineation tools from 
ECGdeli allowed to extract P waves from the AF and healthy ECG recordings which were 
subsequently fed into the feature extraction pipeline (right panel). 


8.2.3 ECG Analysis and Feature Extraction 


For each of the resulting 642,400 simulated and 110,509 clinical P waves of 
the 12-lead ECG, the following features were calculated: duration, dispersion, 
terminal force in V1, peak-to-peak amplitude in each lead, P wave integral as well 
as the rms voltage in the entire signal and terminal 20, 30 and 40 ms of the ECG 
signal trace. These features have been shown to correlate with the presence of 
fibrotic atrial tissue in previous work [176, 273, 288-291] and are visualized in 
Figure 8.5. 
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In some of these previous studies, it was also shown that not only fibrosis but 
also antomical and functional variability prevailing in a healthy cohort have an 
impact on ECG features. Therefore, the network performance was anticipated 
to decrease for samples with extraordinary high and low LA volumes. Thus, 5 
additional non-invasive features representing LA and RA volume, torso volume 
and torso diameter in anterior-posterior direction in the chest and the abdominal 
region were optionally included when training the network. 

PWd was calculated as the time difference between the latest detectable P wave 
offset and the earliest detectable P wave onset across all 12 leads. For that pur- 
pose, the P wave onset and offset was retrieved for each channel individually. In 
simulated signals, only the actual electrical activity originating from the depo- 
larization of the atria reflects in the simulated P waves and no interfering noise 
sources superimpose the simulation outcome. Therefore, the P wave endings and 
beginnings were annotated with simple thresholds defined above the isolelectric 
line for the simulated cohort. ECGdeli [250] annotations for retrieving P wave 
on- and offsets had to be relied on for the clinical dataset. P wave dispersion was 
subsequently derived by computing the time difference between the latest and 
earliest detected P wave ending across all 12 leads. To calculate PTF V1, the 
time difference between the detected P wave ending in V1 and the signal crossing 
the isolelectric line between positive and negative deflection was multiplied with 
the minimum amplitude in V1. The peak-to-peak amplitudes were obtained by 
subtracting the minimum from the maximum P wave signal value in all 12 leads 
individually. P wave integral in each lead was approximated with the trapezoidal 
rule. The root mean square voltages were calculated as the square root of the 
accumulated squared voltage values in the respective time interval extending in 
negative time direction from the individually detected P wave offset in each lead. 
Features that were extracted from only selected or all leads at once are visualized 
in Figure 8.5 (left panel), along with features that were extracted for each of the 
12 leads individually (right panel) and anatomical measures (bottom panel). 
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(a) P wave features extracted from all leads at once or only for selected leads (left) and from 
all leads separately (right). 
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(b) Features for anatomical measures of the atria (left) and the torso (right). 
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Figure 8.5: Feature extraction. The top panel shows the P wave derived features (purple) that 
were extracted from the simulated and clinical ECGs. Blue marked samples represent char- 
acteristic bounds automatically detected in the ECG trace as a basis to compute the P wave 
derived features. The bottom panel shows the additional features representing anatomical 
atrial and thoracic measures. 
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8.2.4 Regression using Neural Networks 


To evaluate whether and to what extent the influence of healthy inter-individual 
anatomical variability on the P wave can be separated from changes caused by 
atrial fibrosis, a regression neural network was set up. The built-in function 
fitting neural network in MATLAB was used with 2 hidden layers comprising 
10 and 5 neurons each. The network’s dimensions were increased compared to 
the algorithms applied in Nagel er al. [122] to account for the increased problem 
complexity arising from providing 60 additional features as input parameters to the 
network. In total, 75 P wave-derived features described in detail in section 8.2.3 
and optionally 5 additional features for anatomical measures could serve as input 
for the network: 


° P wave features extracted from only selected or all leads at once 
- P wave terminal force in V1 
- P wave duration 
- P wave dispersion 


° P wave features extracted for all 12 leads individually 
- P wave integral 
- root mean squared voltage in terminal 40 ms 
- root mean squared voltage in terminal 30 ms 
- root mean squared voltage in terminal 20 ms 
- root mean squared voltage in the entire signal 
- peak-to-peak amplitudes 


° Anatomical measures 

- left atrial volume 
right atrial volume 
- torso volume 


torso diameter in anterior-posterior direction in the abdominal region 
torso diameter in anterior-posterior direction in the chest region 
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Bayesian regularization was chosen as a training algorithm to predict the volume 
fraction of LA fibrosis as an output. To evaluate the network’s performance, 
6-fold cross validation was applied wherefore P wave feature data were split into 6 
subsets. P waves generated with one specific atrial geometry (in case of simulated 
data) or extracted from the ECG of one specific patient (in case of clinical data) 
were never assigned to different training and testing sets in one split but instead 
kept all in the same set (pseudo-random split’) as proposed by Luongo er al. [228]. 
This procedure ensured that the network is blinded to previously unseen atrial 
geometries and patient data during testing and does not link P wave features in 
the test set to nearly the same feature values seen in the training set caused, e.g., 
only by a slight rotation of the atria in case of simulated data. Luongo et al. [228] 
also found that excluding thoracic instead of atrial geometries during training still 
leads to good generalization results, which is why it was decided to allocate the 
pseudo-random splits using certain atrial geometries exclusively during testing. 

For each split, the simulated and clinical data were divided into 6 sets by 
combining clinical P wave features of 1796-1797 healthy subjects and 4-5 AF 
patients with features extracted from the simulated dataset generated based on 
13-14 different atrial geometries. Thus, the entire dataset was divided into groups 
of 67%, 17% and 17% for training, validation, and testing, respectively. 6-fold 
cross-validation was performed by employing one of these subsets at a time as a 
test set and using the remaining 5 sets for training and validation. Through this 
procedure, it was ensured that P wave features extracted from one patient are only 
included in either of the training, validation or test set, and that the testing of the 
trained network is only performed based on the P waves from patients the network 
has never seen before. 

The network was trained 10 times for each of the 6 pseudo-random train- 
validation-test configurations for a maximum of 1000 epochs each. Its perfor- 
mance was assessed by taking the mean of the absolute root mean squared error 
(rmse) between the predicted and the actual volume fraction of LA fibrosis among 
all 10 training iterations. 

The network was trained and tested using different input feature and data 
source configurations to shed light on the following research questions: 
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RESEARCH QUESTIONS 


A: Does the network benefit from being provided simulated 
data as an additional input data source when estimating the 
atrial fibrotic extent with clinical ECGs? 


B: Does the addition of realistic ECG noise and the 
application of different low-pass filter settings on simulated 
training data have an impact on the network performance 


when trained on a hybrid dataset? 


C: Do anatomical measures of the atria and the torso 
complementing P wave-derived features as additional input 
data contribute to an improved network performance? 


D: To what extent is the network’s estimation of fi- 
brotic volume fraction still reliable if input feature values are 
corrupted by noise? 


To quantify the added value that simulated data can contribute to improving 
the estimated fibrotic volume fraction (A), the machine learning regression model 
outcome was compared when trained once by only using features from the clinical 
ECGs and once by additionally providing simulated data during training. 

The influence of filter settings and the preceding superposition of simulated 
data with realistic ECG noise (B) was assessed by comparing the network output 
on the same clinical test set when either trained with only clinical data or with a 
hybrid dataset for which the simulated data were subject to different preprocessing 
steps (noise-free and without filtering, addition of noise and 150 Hz low-pass 
filtering, addition of noise and 40 Hz low-pass filterine). 

The impact of additional input data consisting of anatomical measures (C) on 
the network performance was evaluated by training the network once by only using 
simulated P wave-derived features and once by extending these simulated features 
with the anatomical measures from the geometries that were employed for the 
respective simulation run. Building on the analysis described in section 8.3.1, it 
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could be reasonable to assume that specific P wave features are highly dependent 
on the anatomical properties of the (virtual) patient and thus, the network’s 
performance would decrease systematically for samples with extraordinary high 
and low atrial volumes or anatomical torso measures. It was therefore evaluated 
if a potential systematic under- and overestimation for data samples of low and 
high atrial and torso sizes can be avoided and compensated for by additionally 
providing features representing anatomical measures as input to the network. As 
values for the anatomical measures were not available for the clinical ECG data, 
the analysis was carried out on simulated data only. 

An important aspect for a successful clinical translation consists of the net- 
work’s sensitivity towards inaccurately determined feature values. Although the 
ECG features are easily and robustly extractable from noise-free simulated signals, 
their values are subject to several disturbances in the clinical use case. Especially 
determining PWd is prone to errors since sensitive thresholds are necessary to 
accurately capture all signal parts belonging to the ECG [122]. Also measur- 
ing the volume of the atria, potentially serving as an additional input feature 
quantifiable non-invasively via echocardiography, is oftentimes inaccurate [88]. 
It was therefore investigated to what extent the network’s estimation of fibrotic 
volume fractions is still reliable if the feature values were corrupted by noise (D). 
Furthermore, the analysis could shed light on key features that require accurate 
feature extraction methods for reliably estimating the amount of fibrosis with 
the proposed network. To recreate the clinical use case of inaccurately extracted 
features in a systematic way and controlled environment, the simulated P wave 
features and the anatomical measures were superimposed with noise. For each 
feature, 11 noise vectors were realized consisting of Gaussian noise with zero 
mean and a standard deviation (Oy) set to different fractions n with n € [0, 0.05, 
0.1, ..., 0.5] of the standard deviation of the respective feature distribution (os). 
By choosing Gaussian noise, it was possible to account for different levels of im- 
precisely extracted values occurring when applying automated feature extraction 
software. 

The different network configurations set up to address the different aspects 
described above are summarized in Table 8.3. 
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8.3. Results 


83 Resulls 


The resulting P wave features were analyzed regarding three different aspects: 
First, the influence of anatomical variability and electrode positions on the P wave 
features were compared to those caused by the presence of atrial fibrosis (sec- 
tion 8.3.1). Afterwards, it was investigated to what extent the volume fraction 
of fibrotic substrate resulted in altered P wave features (section 8.3.2). In sec- 
tion 8.3.3, it was analyzed if the effect of healthy anatomical variations on P wave 
features can be separated by a neural network from the feature changes resulting 
from the presence of fibrosis. 


8.3.1 Influence of Geometries, Rotation Angles and 
Electrode Positions on P wave Features 
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Figure 8.6: Exemplary ECGs resulting from a variation of anatomical and functional simula- 
tion parameters. From left to right: atrial geometry (small to large LA volumes range from 
light to dark blue), torso size (small to large torso diameters range from light to dark red), ro- 
tation angle (small to large rotation angles around the z axis range from light to dark orange), 
conduction velocity (small to large values range from light to dark blue) and fibrotic LA volume 
fraction (small to large LA fibrotic volume fractions range from light to dark turquoise). 
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Exemplary P waves (lead II and V1) for systematic variations of the atrial and 
thoracic geometry, the rotation angle, conduction velocity and fibrotic volume 
fraction are shown in Figure 8.6. The color code represents the dominant change 
underlying a variation of the respective factor, e.g., LA volume as a key property 
of atrial geometry alterations. The other dominant properties were the torso size 
in anterior-posterior direction, the rotation angle around the z-axis, conduction 
velocity and fibrotic LA volume fraction for the torso geometry, orientation angles, 
tissue-scale functional variability and fibrotic infiltration, respectively. The atrial 
geometry caused the largest deviations in P wave morphology. A change in torso 
size reflects predominantly in a variation of P wave amplitudes. Increasing CV 
values caused PWd to decrease. 

The visual observations above based on exemplary simulated P waves were 
extended by systematically evaluating P wave features of the complete in silico 
dataset. The individual influence of anatomical variations and the placement of 
electrodes on these features was assessed by varying one of these factors at a time 
while keeping the remaining ones constant at their reference value. In this way, 
the variance of each P wave feature resulting from a change of the atrial geometry, 
the torso geometry, the atrial rotation angle and in the case of PTF V1 also the 
position of the V1 electrode was analyzed quantitatively. 

Figure 8.7 shows the P wave feature distributions. To gauge the potential of 
one specific P wave feature to be a predictor for the fibrotic atrial volume fraction, 
also the P wave feature values resulting from fibrotic infiltration in the reference 
geometry are shown aside. 
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(a) Influence of anatomical factors on P wave derived timing fea- 
tures (duration and dispersion). 
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(b) Influence of anatomical factors on PTF V1 and P wave ampli- 
tude in lead V6. 


Figure 8.7: Influence of anatomical variations and electrode position on P wave features. The 
effect of atrial geometry, thoracic geometry, atrial rotation angles, V1 electrode placement 
and the fibrotic LA volume fraction on (a) PWd (top left), P wave dispersion (top right) and 
(b) PTF-V1 (bottom left), P wave amplitude in V6 (bottom right). The colored sample points 
indicate one major change resulting from a variation of the respective influencing factor which 
consist of the total LA volume (small to large LA volume from light to dark blue), the torso 
diameter in anterior-posterior direction (small to large diameter from light to dark red), the 
rotation angle around the z-axis (small to large angle from light to dark orange), the position 
of the V1 electrode along the inferior-superior direction (inferior to superior direction from 
light to dark purple) and the fibrotic LA volume fraction (0%-45% from light to dark turquoise). 
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Figure 8.7b reveals that all examined factors have an impact on the value of 
PTF V1. The largest variance is however caused by a change in the torso geometry. 
Thoracic variations characterized by a low diameter in anterior-posterior direction 
and atrial variations holding large LA volume values yielded increased PTF V1 
values, with some of them even above the threshold of 4mV-ms which is typically 
used when diagnosing structural heart abnormalities. The influence of the exact 
V1 electrode position on PTF V1 is visualized directly on the top right hand 
corner of Figure 8.7b. A rather superior placement of the V1 electrode on the 
thorax caused increased PTF V1 values. The interquartile ranges in Figure 8.7b 
show that the fibrotic LA volume fraction had a smaller effect on PTF V1 than 
any anatomical variation. PWd was not affected by torso size and rotation angle 
(Figure 8.7a, left panel). On the other hand, large LA volumes yielded P waves 
with durations up to 130 ms. Electrical and structural remodeling of fibrotic tissue 
resulted in PWds up to 160 ms. Due to the proximity of the V6 electrode to the LA 
lateral wall characterized by a high probability for the presence of fibrosis [145], 
the peak-to-peak amplitude is shown for lead V6 in Figure 8.7b. All four analyzed 
factors affected this feature, while the torso geometry caused the largest variation. 
P wave dispersion was mostly affected by the atrial geometry (Figure 8.7a, right 
panel). The maximum dispersion was 11 ms for the healthy anatomical variations 
and 13 ms in the presence of fibrosis. 


8.3.2 Effect of the Fibrotic LA Volume Fraction on P 
wave Features 


To examine to which extent the specific volume fraction of local atrial substrate 
modification causes graded changes in the P wave features, each set of 12-lead 
ECGs belonging to one particular volume fraction of fibrosis was analyzed at a 
time. As an example for the group of lead independent features, the distribution 
of PWd among the simulated and clinical cohort is shown in Figure 8.8a. When 
considering all anatomical and functional variations in the simulated dataset of 
642,400 ECGs as it is the case in Figure 8.8a, the value ranges of all features 
overlap between the healthy and all diseased cohorts. Of the three lead independent 
features (PWd, dispersion and PTF V1), PWd was the most discriminative single 
feature. 
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As an example for the lead dependent set of P wave features, the rms voltage 
values in the terminal 20ms of the P wave in lead I are shown in Figure 8.8b for 
each patient and degree of fibrosis. Feature values systematically decrease in the 


simulated cohort for an increasing left atrial fibrotic extent. In the clinical cohort 
a similar trend, albeit less pronounced, is visible. 


165 


Chapter 8. Estimation of Left Atrial Fibrosis 


iS] 
{=} 
© 


_ 
wn 
° 


=) 
Ss 
surface area fraction 


Low voltage endocardial 


P wave duration in ms 


YQ 
° 
© 


= 
° 
© 


P wave duration in ms 
a 
© 
p= 
Remodeled fibrotic 
tissue volume fraction 


ge ge ge ge ge ge ge ge 
° e SSE é < 


Left atrial fibrotic extent 


(a) Distribution of P wave duration among the clinical (top panel) and simulated cohort (bottom 
panel). 
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(b) Distribution of P wave amplitude in lead | among the clinical (top panel) and simulated 
cohort (bottom panel). 


Figure 8.8: Influence of left atrial fibrotic extent on P wave derived features. 
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8.3.3 Estimating the Amount of Fibrosis with Neural 
Networks 


The network set up as explained in section 8.2.4 was trained for the four different 
scenarios explained in section 8.2.4. In Table 8.4, the rmse values of the regression 
model for the different training scenarios summarized in Table 8.3 are outlined. 
The results for these training options, each designed to address the individual 
research questions, are unraveled in detail in the following sections. 
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8.3.3.1 Impact of Hybrid Training Data on Fibrosis Estimation 
Accuracy on Clinical Data 


When providing the regression neural network with all 75 extracted P wave 
features, the LA fibrotic volume fraction on a clinical test set was estimated 
with an absolute rmse averaged over all 6 cross-validation runs of 16.57% and 
17.56% in case the network was trained on a hybrid and only on a clinical dataset, 
respectively. 

Figure 8.9 shows the distribution of predicted volume fraction of fibrosis for 
the test sets included in the different cross-validation runs when trained on a 
hybrid (bottom panel) and only on a clinical dataset (top panel). 

When trained on a hybrid dataset, the network performance was increased 
especially for patients with a fibrotic extent in the range of [0%, 45%] which 
corresponds to the interval of fibrotic volume fraction that was defined for the 
simulations (compare data samples outlined in green in Figure 8.9). On the 
contrary, training on clinical data only yielded superior results for patients with 
an extraordinary high extent of fibrosis outside the ranges that were included in 
the simulated dataset (compare data samples outlined in red in Figure 8.9). When 
only considering clinical data with a low voltage area fraction <45 % as in the 
simulated dataset, the network performance error was comparable to the results in 
the simulated test sets and quantified to 12.28 %. 

The performance metric of the neural network is shown in Figure 8.10 when 
trained on a hybrid dataset (dots) or on clinical data only (triangles). The marker 
color indicates whether the ECGs in the test set for performance evaluation were 
extracted as a subset of the simulated (cyan) or the clinical (magenta) ECGs. 

In any case except for split 6, the estimation of the fibrotic extent was more 
accurate if simulated data were additionally included in the training set. The 
improvement when training the network with the hybrid dataset averaged over all 
clinical test sets was around 1%. 


8.3.3.2 Influence of Noise and Filter Settings Applied to Simulated 
Data on Network Performance 


The influence of adding realistic ECG noise [265] to the simulated data and 
subsequently filtering the signals choosing different cut off frequencies for the 
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Figure 8.9: Network performance for predicting the volume fraction of atrial fibrosis evalu- 
ated on a clinical test set when trained only on clinical data (top panel). The distribution of 
predicted vs. ground truth left atrial fibrotic extent when trained on a hybrid dataset is shown 
in the bottom panel in case of clinical (magenta) and simulated test data (cyan). 


low-pass filter is shown for the test sets of all 6 cross-validation runs in Figure 8.11. 
The solid line represents the turning point for which training on a hybrid dataset 
yielded improved regression results over training on clinical data only. 

Adding noise and selecting a low-pass cut off frequency of 150 Hz yielded the 
best performance for 3 out of 6 test sets each belonging to one cross-validation 
run (green boxes in Figure 8.11). As these filter settings were also applied for the 
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Figure 8.10: Performance of a neural network trained on a hybrid dataset (dots) and on clin- 
ical data only (triangle) when evaluated separately on a simulated (cyan) and clinical test set 
(magenta). 
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Figure 8.11: Influence of superimposed noise and low-pass filter cut off frequencies on net- 
work performance. The difference in the network’s rmse between training on clinical data and 
on a hybrid dataset is shown for all 6 cross-validation runs for different filter configurations 
applied to simulated data (faded pink: noise-free, without filtering; pink: superimposed noise 
and subsequent low-pass filtering with 150 Hz, purple: superimposed noise and subsequent 
low-pass filtering with 150 Hz). 
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analysis described in section 8.3.3.1, it was shown that these filter settings also 
yielded improved results for 2 further test sets over the case where only clinical 
data was used. Only for split 6, training on clinical data resulted in a smaller 
rmse. Out of the three low-pass filter options, a cut off frequency of 40 Hz applied 
to the simulated data led to the poorest performance (compare dark pink dots in 
Figure 8.11), as the averaged rmse over all 6 test sets was 17.51% in contrast 
to 16.57% when setting the cut-off frequency to 150 Hz and 17.56% when only 
training on clinical data. 


8.3.3.3 Added Value of Anatomical Measures as Additional Input 
Data 


In Figure 8.12, the estimated and ground truth fibrotic volume fraction output 
by a network trained with and evaluated on the simulated data are shown when 
only providing 75 P wave derived features as input to the network. The scatter 
points represent selected samples in the test splits and their face colors encode 
their respective total LA volume (top panel) and torso volume (bottom panel). 

The visualization indicates that the prediction performance decreased for 
samples with extraordinary high and low LA and torso volumes. Especially, a 
systematic under- and overestimation for low and high LA volumes, respectively, 
prevailed. Thus, 5 additional non-invasively measurable features representing LA 
and RA volume, torso volume and torso diameter in anterior-posterior direction in 
the chest and the abdominal region were included for training the network. In this 
case, the rmse between the predicted and ground truth volume fractions covered 
by fibrosis decreased from 8.69% to 7.92% . 


8.3.3.4 Network Sensitivity to Inaccurately Extracted Features 


Figure 8.13 shows the rmse of the network output averaged over all cross- 
validation test sets depending on the applied noise level n to the input feature 
values. The decrease of the network performance is depicted for 10 out of the 80 
input features for which the network’s rmse was affected the most. The absolute 
rmse estimated by the network increased by 1% compared to the noise-free base- 
line case for a noise level of Oy / Os = 0.2 for the torso volume (corresponding to 
an absolute noise standard deviation of oy = 0.0028 m?) and a noise level of oy / 
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Figure 8.12: Network performance for predicting the volume fraction of atrial fibrosis only 
based on P wave derived features. The color code represents the LA volume (top) and torso 
volume (bottom) of each individual sample point (light: small volume; dark: high volume). The 
solid black line represents perfect prediction. 


Os ~ 0.3 for the torso diameter in anterior-posterior direction in the abdominal 
region (Oy = 14.36 mm). Besides the anatomical measures, a distortion of rms 
voltage features predominantly in the terminal 30 ms of the P wave caused the 
most marked decrease in network performance compared to the noise-free baseline 
case. 
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Figure 8.13: Sensitivity of the network towards inaccurately extracted features. The rmse of 
the network is shown for different levels of noise added to the specific feature values one at a 
time. The solid horizontal line represents a drop of the network performance of 1% compared 
to the noise-free baseline case. 


8.4 Discussion 


8.4.1 Main Findings 


In this work, a total of 642,400 simulated 12-lead ECGs of healthy subjects and 
patients with different LA volume fractions covered with fibrosis were generated. 
Different atrial and thoracic anatomical models derived from SSMs as well as 
varying orientation angles of the atria within the torso are hallmarks of the virtual 
cohorts. Additionally, different regionally heterogeneous sets of CVs were applied 
representing inter-individual CV variation in healthy subjects causing additional 
P wave feature variability in all model cohorts. P wave features including duration, 
PTF V1, peak-to-peak amplitudes, rms voltages in the terminal 40, 30 and 20 ms, 
P wave integral and dispersion were calculated for each signal. The influence of 
anatomical properties on these features’ variances was compared to the variance 
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caused by fibrotic infiltration of the atrial tissue of various degrees. None of the 
investigated features showed distinct ranges for the diseased cohort and different 
healthy anatomical variations. 

The intervals of all feature values occurring from anatomical and functional 
variations were greater than or equal to the variability in case of fibrotic infiltration 
of atrial tissue (compare Figure 8.7). The atrial geometry variation caused altered 
P wave morphologies resulting in varying values for all features, whereas the 
torso geometry variation mainly affected P wave amplitudes. Nevertheless, all 
investigated features were characterized by a systematic change in their values 
for an increasing volume fraction of fibrotic atrial tissue (compare Figure 8.8a 
and 8.8b). 

The electrodes for the lateral ECG leads V5 and V6 are located closest to the 
LA lateral wall. According to the findings reported by Highuchi er al. [145], which 
the distribution of fibrosis in the computational models was based on, the presence 
of fibrotic tissue in this region holds a considerable high probability. Therefore, the 
amplitude decrease in V5 and V6 [122] that were found for an increasing volume 
fraction of LA fibrosis can be explained by a growing amount of passive fibrotic 
elements not contributing to the overall source distribution in the left pulmonary 
vein (PV) antrum and the LA lateral wall. Rms voltages in the terminal 40, 30 and 
20 ms systematically decreased for an increasing extent of atrial fibrosis which 
occurs due to delayed activation of fibrotic patches causing low voltage ECG signal 
parts ensuing the normal depolarization of healthy myocardial tissue. Accordingly, 
also PWd systematically increased with the amount of LA fibrosis. Thereby, a 
careful choice of the threshold for detecting the P wave ending in each lead is 
crucial as already reported by Jadidi er al. [273]. In this simulation study, a simple 
amplitude threshold of 1.5 - 10 mV could have been chosen. However, when 
changing this threshold value to 3 - 10 3 mV, PWd doesn’t show the steady increase 
for different fibrotic LA volume fractions as shown in Figure 8.8a. In this case, 
the low voltage signal parts at the end of the P wave caused by delayed activation 
of fibrotic regions in the LA are ignored, which results in underestimation of PWd. 
This implies that a sufficiently high signal-to-noise ratio is required for clinically 
recorded signals in order to apply sensitive thresholds to accurately measure PWd. 

A neural network provided with the aforementioned features of the simulated 
data succeeded in estimating the volume fraction of atrial fibrosis with an average 
error of 8.69% fibrotic LA volume. When also including anatomical measures for 
atria or torso, the rmse of the regression network could be improved and decreased 
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to 7.92% fibrotic volume. Thus, the results of this simulation study suggest that 
1t is beneficial to provide the network with additional anatomical measures non- 
invasively acquirable by estimating the torso volume with measurements of the 
torso perimeter and the atrial diameter based e.g. on echocardiographic record- 
ings to further increase the prediction accuracy. By comparing the overlapping 
interquartile ranges in Figure 8.8a to those in Figure 8.12, it can be inferred that 
the volume fraction of fibrotic substrate can be estimated more accurately with 
ECG-based machine learning approaches than by isolatedly considering single 
P wave features, such as e.g., P wave duration. Therefore, the network indeed 
seems to be capable of separating the effects of anatomical variability from the 
influence of fibrotic substrate on the P wave. 

The impact of inaccurately determined features — as it is likely the case in a 
clinical setting — was investigated by adding Gaussian noise to the robustly and 
accurately extractable feature values from simulated data. It was found that the 
network’s rmse increased the most if noisy values for the torso volume, torso 
diameter and the rms voltages in the terminal sections of the P wave were provided 
to the network independent on how the data were split for training, testing and 
validation. Likely reasons for these findings could involve that the presence of 
fibrotic tissue reflects in a decrease of the P wave amplitudes that are also affected 
by the torso volume and could not be properly compensated for if this entity 
was extracted inaccurately. Furthermore, the reduced CV in the fibrotic regions 
causes rms voltages in the terminal P wave intervals to decrease. For this reason, 
these features are important to separate the changes in P wave features resulting 
from fibrotic infiltration of atrial tissue from those caused by healthy anatomical 
variations and thus must be accurately measured. The network was able to 
generalize well to ECGs simulated with unseen atrial and torso geometries if 
ECGs generated with different geometries, but similar atrial and thoracic volumes 
were included during training [122]. 

Moreover, the feasibility of non-invasively estimating the amount of fibrosis in 
the atria by using features from clinical 12-lead ECGs as an input to a feedforward 
neural network was demonstrated. The rmse between the estimated and the 
ground truth fibrotic volume fraction on a test set composed of clinical signals 
was 17.56% when using only clinically recorded ECGs during training of the 
network. The error was reduced to 16.57% by additionally including simulated 
data during training. Non-matching absolute values in the feature distributions 
between the simulated and clinical cohort, for example in case of PWd (compare 
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Figure 8.8a) could likely be a reason for the limited, though visible, added value 
that simulated data carry when used as an additional input dataset. However, 
measuring P wave duration automatically in clinical data remains challenging, 
especially in FAM patients where low voltage signal parts are not uncommon to 
occur after the presumably detected P wave ending. Averaging multiple P waves 
of the same patient could provide a possibility to reduce noise in the signal and 
to potentially more accurately determine the P wave ending. However, as ECGs 
of only 27 patients were available in the clinical dataset of this study, averaging 
over multiple beats was not performed for the sake of keeping the P wave dataset 
as large as possible. An incorrectly detected P wave offset also affects other 
P wave features in the set of network input data. Among them are the rms voltages 
in the terminal P wave intervals which also turned out to impair the network 
performance notably if not assigned their correct values (compare section 8.3.3.4). 
Compared to current state-of-the-art methods to quantify the fibrotic extent in 
the atria, the ECG-based estimation can be a reasonable indicator for fibrosis 
infiltrating the atrial myocardium, but does not provide quantitatively as accurate 
results as Invasive mapping procedures or expensive imaging techniques. 

Adding noise to the simulated data prior to feature extraction was important 
to reduce the estimation error with the hybrid training dataset as this seems to 
contribute to closing the domain gap between clinical and simulated ECGs. When 
including features directly extracted from the raw simulated ECGs during train- 
ing, the network’s performance declined from 16.57% (150 Hz low-pass cut off 
frequency) to 17.10% (without added noise and filtering) which is comparable to 
the performance in case the network is only trained on clinical data. Furthermore, 
choosing appropriate filter settings for all signals was necessary for a successful 
fibrosis estimation. Applying a low-pass filter with a cut-off frequency of 40 Hz 
instead of the chosen 150 Hz, the network performance could only be improved 
for 2 out of 6 splits when additionally providing simulated data for training. This 
highlights the need of preserving subtle ECG characteristics that might arise due 
to delayed and scattered depolarization of the tissue in fibrotic areas and the 
necessity of recording clinical signals of high quality. 
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8.4.2 Related Work 


Yoshizawa et al. [272] reported that new-onset AF could be estimated using 
P wave amplitude in lead II and P wave dispersion features with a sensitivity of 
69.1 % and specificity of 88.2 % in their clinical study comprising 68 AF patients 
and the same number of controls. Lankveld et al. [289] drew on several time- and 
frequency domain features to predict AF recurrence rates after pulmonary vein 
isolation (PVI) in 93 patients with an area under the curve (AUC) of 0.76. In the 
clinical study conducted by Nakatani et al. [288], a combination of several P wave 
amplitudes led to a sensitivity of 69 %, a specificity of 88 % and an AUC of 0.77 
for predicting the presence of low voltage areas >10 % in 50 AF patients. Jadidi 
et al. [273] found that a PWd above 150 ms — provided a very low threshold is set 
for detecting the P wave offset — was a predictor for advanced LA low voltage 
substrate with a sensitivity of 94.3 % and a specificity of 91.7 %. Conte et al. [302] 
report that PWd and beat-to-beat variability of P wave morphologies held the 
highest discriminative power to identify patients holding an increased risk of AF 
occurrence. 

Especially sensitivity and AUC results for discriminating between the healthy 
and the FAM group were higher in this simulation study compared to the findings 
in previous clinical studies [122]. Possible reasons could involve the number of 
P waves included in this work (642,400 simulated P waves compared to <100 ECG 
recordings in clinical studies) leading to a larger database for the network to learn 
relations between P wave features and fibrotic LA volume fraction. Moreover, it 
was found that additional anatomical measures improve the estimation outcome. 
On the other hand, it was investigated in this study whether detecting fibrotic 
substrate extent in the atria is feasible, whereas most clinical studies focused on 
predicting AF recurrence rates and new-onset AF episodes. Even though the latter 
are reported to correlate with the amount of the fibrotic LA volume fraction, there 
is no clear 1:1 relation between them complicating the comparison between this 
study and those conducted by Yoshizawa et al. [272] and Lankveld et al. [289]. 
Besides that, the chosen set of investigated P wave features in this study resembles 
but does not exactly equal the one used in previous clinical studies. The set of 
features was indeed chosen based on previous findings, but a combination of a vast 
number of features was used compared to clinical studies [272, 289]. In contrast, 
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beat-to-beat alterations were not evaluated in this study as proposed by Conte et 
al. [302] since the simulation setup only allows for a single beat analysis. 


8.4.3 Limitations 


The analyses and results of this study are to large parts based on simulated data. 
Even though an established modeling methodology for fibrotic tissue was chosen 
covering versatile aspects of electrical and structural remodeling on cell and tissue 
level, other modeling methods could have led to different results [82]. In this 
context, future directions could involve to examine if the proposed method is 
capable of detecting fibrotic tissue in signals generated with any fibrosis remodel- 
ing methodology or if it is particularly sensitive towards either ionic remodeling, 
locally heterogeneous conduction velocities or percolation effects. Further future 
directions could include an investigation of how the specific location of fibrotic 
patches influences the accuracy of the estimation of left and right atrial fibrosis. 
To generate simulated signals at scale, yet at a reasonable computational cost, 
the Eikonal model (see section 3.1.2.3) as a propagation driver and the infinite 
volume conductor (see section 3.1.3.3) as a simplified forward calculation method 
was drawn on. On the one hand, the Eikonal model is capable of faithfully re- 
producing LATs obtained with the bidomain model for sinus rhythm simulations 
on fibrotically infiltrated atrial models. Building on the analyses described in 
chapter 4, no substantial error is expected to have impaired the computed ac- 
tivation sequence and the ECGs derived therefrom. However, full bidomain, 
pseudo-bidomain or reaction-Eikonal [104] simulations could account for diffu- 
sion effects neglected when deriving the source distribution by only solving the 
Eikonal equation and shifting pre-computed action potentials in time. On the other 
hand, the infinite volume conductor was found to systematically overestimate 
P wave amplitudes in leads V1-V3 compared to the finite element method [303] 
(compare chapter 4). As this affects most of the features extracted from V1-V3 
that were employed for the neural network, especially the amplitudes, a domain 
shift between the simulated and the clinical data for the affected features might 
have limited the benefits of providing more input variability through simulated 
signals during training. However, more sophisticated modeling approaches require 
considerable longer computation times compared to the methods employed in 
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this study. The latter were therefore intentionally chosen for the sake of reducing 
computational cost to make the generation of a large database feasible. 

The fibrosis distribution in the virtual patient cohort was set based on a spatial 
histogram of high image intensity ratios in late Gadolinium enhanced magnetic 
resonance images [145]. Thus, the simulation dataset is mainly characterized 
by fibrotic patches on the posterior left atrial wall. Opposite to this, the clinical 
electroanatomical voltage maps employed for extracting the ground truth fraction 
of fibrosis on the endocardium mainly exhibit low voltage areas on the anterior 
left atrial wall [304]. Furthermore, the fibrotic volume fractions defined for the 
virtual patient cohort ranged only up to 45 %, whereas the maximum surface 
area fraction of low voltage electrograms in the clinical dataset was 77.28 %. 
Therefore, the lack of P wave features pertaining to high fibrotic extents in the 
in silico training set could have caused the underestimation of fibrotic volume 
fractions in patients characterized by large low voltage areas in the clinical test 
set and therefore reduced the overall network performance. This is also visible in 
Figure 8.9 as the estimated fibrotic extent complies with the ground truth values 
to a markedly higher degree for patients with low voltage areas <45 %. 

Moreover, the impact of noise being added to simulated data was studied with 
regard to the added value that simulated signals can carry to enrich a clinical 
dataset on the one hand (also compare section 8.2.2) and regarding the sensitiv- 
ity of the network output towards inaccurate feature values on the other hand. 
However, in the simulation study, only the network’s sensitivity in case of noise 
being added to a single input feature was analyzed. In clinical practice though, 
measurement uncertainty usually affects multiple features at once. Therefore, 
clinical ECG recordings as input for the network require robust signal processing 
methods for extracting key features accurately. However, the impact of noise on 
robust and accurate feature extraction [305] was not particularly analyzed in this 
study. While noise and a QRS complex immediately following the P wave will 
definitely impede the feature extraction if averaging over several beats is not feasi- 
ble, the focus of this study was not to develop robust feature extraction and signal 
processing methods. Instead, the intention was to probe the general potential of 
P wave features as predictors for the presence of atrial fibrosis provided that their 
values can be accurately extracted from the ECG. 

Since a large-scale database of more than 700,000 12-lead ECG P waves from 
healthy and diseased simulated and clinical cohorts is now available, it could also 
be worth to use the signals directly as an input for a deep learning network to 
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estimate the fibrotic LA volume fraction without prior feature extraction. Thereby, 
the network could directly learn relations between the different cohorts and derive 
rules to distinguish between them. Additionally, the effect of different ablation 
strategies (PVI vs. additional ablation targets) and their effect on the 12-lead ECG 
could be examined in a future study and reveal new insights regarding individual 
therapy planning on a sub-population level. 


8.5 Conclusion 


Given the results of the study presented in this chapter, the ECG constitutes a 
non-invasive and widely available tool in clinical practice to indicate left atrial 
volume fraction of fibrotic tissue up to an uncertainty of around 16 %. Moreover, 
simulated signals of a virtual patient cohort covering anatomical, functional and 
pathological variability can contribute to reduce the estimation error. 
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Detection of Left Atriql 
Enlargement 


In this chapter, the benefits of extending clinical electrocardiograms (ECGs) by 
simulated training data derived from a bi-atrial statistical shape model (SSM) 
to improve the automated detection of left atrial enlargement (LAE) based on 
P waves of the 12-lead ECG are demonstrated. 


The content of this chapter is taken from conference proceedings that have been 
published in Lecture Notes in Computer Science (LNCS) [121]. Most passages in 
this chapter have been quoted verbatim from the publication and are adapted or 
reprinted by permission from Springer Nature Customer Service Centre GmbH: 
Springer Nature, Statistical Atlases and Computational Models of the Heart. 
Multi-Disease, Multi-View, and Multi-Center Right Ventricular Segmentation 
in Cardiac MRI Challenge (A Bi-atrial Statistical Shape Model as a Basis to 
Classify Left Atrial Enlargement from Simulated and Clinical 12-Lead ECGs, C. 
Nagel, M. Schaufelberger, O. Doessel, A. Loewe), Copyright: Springer Nature 
Switzerland AG (2022). 
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91 Introduction 


LAE is not only among the risk factors and an indicator for hypertensive heart 
disease [87, 306], but contributes also to the arrhythmogenesis and maintenance 
of atrial fibrillation (AF) [65, 86, 218]. Thus, an established LAE diagnosis could 
serve as a valuable risk marker for the early detection of AF which is crucial for 
choosing the appropriate patient-specific treatment therapy. 

As an alternative to cardiac chamber size measurements with imaging tech- 
niques, a clinical diagnosis of LAE could initially also be informed by an evalu- 
ation of the 12-lead ECG [307-310]. The altered ECG signal characteristics in 
LAE patients comprise on one side an increased duration and absolute amplitude 
in the negative deflection of the P wave in lead V1 [308, 309]. Furthermore, a 
dilation of the left atrium also reflects in an increased overall duration of the 
P wave in all leads since the depolarization wavefront needs to traverse a larger 
surface area [308]. Additionally, a bifid P wave with an interval >40 ms between 
both peaks in lead II is an indicator for P mitrale [310]. In contrast to imag- 
ing techniques applied to quantify the atrial chamber sizes, the ECG features 
an inexpensive, easily accessible and widely available tool in clinical practice. 
An automated analysis of ECG signals with e.g. machine learning techniques 
could facilitate the early diagnosis of LAE and in turn contribute to a proper risk 
stratification for AF [311, 312]. 

However, the application of machine learning algorithms to clinical ECGs for 
an automated disease classification entails several challenges. On the one hand, 
large clinical datasets are rarely available and also the diagnostic classes of interest 
are usually not balanced [313-315]. In spite of being the largest publicly available 
online ECG database to date, PTB-XL [16] comprises 9,528 12-lead ECGs from 
healthy individuals, but yet only 421 signals of LAE patients. A failure to meet 
the requirement of evenly distributed samples in the classes to be distinguished 
by the machine learning model, a classification bias is likely to be introduced 
at the expense of prediction accuracy for the minority class. Moreover, ground 
truth labels marking the underlying cardiac pathology have to be set manually 
by cardiologists. However, expert annotations are inevitably subject to inter- and 
intra-observer variabilities contributing to unreliably and inconsistently labeled 
signals that can severely impair the performance of a machine learning classifier 
[24]. 
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These limitations call for simulated ECG signals as an additional data source 
with precisely known ground truth labels and arbitrarily selectable class distribu- 
tions to overcome the above mentioned drawbacks of clinical signals. Especially 
for the use case of identifying LAE, a bi-atrial SSM carries the potential to produce 
a vast amount of atrial geometries with predefined and equally distributed left 
atrial volumes that can be employed for electrophysiological simulations to obtain 
simulated P waves of the 12-lead ECG. It was therefore investigated if synthetic 
P waves resulting from simulations carried out on anatomical models derived as 
instances from a bi-atrial SSM can improve the detection of clinical LAE ECGs. 


92 Methods 


92.1 Database 


9.2.1.1 Simulated Data 


The bi-atrial SSM described in chapter 5 [169] was used to generate 95 anatomical 
models of the atria augmented with fiber orientation, inter-atrial connections and 
tags for anatomical structures. These models were made publicly available [200]. 
The eigenmode coefficients of the SSM were chosen such that the left atrial 
volumes of the 95 geometries were uniformly distributed in a range of 30 - 65 ml, 
covering the left atrium (LA) volume range reported in a large clinical cohort 
study including a healthy control group and LAE patients [219]. For the computer 
models, left atrial volumes (LAVs) were calculated as the volume bounded by the 
surface of the left atrial body as shown in Fig. 9.1. The left atrial appendage as well 
as the pulmonary vein ostia were excluded for the volume assessment. The atrial 
SSM was built based on magnetic resonance (MR) and computed tomography 
(CT) segmentations but clinical LAV reference values are usually specified based 
on 2D echocardiography data. Therefore, each initially calculated LAV value of 
the virtual cohort was multiplied with a correction factor of 0.75 to account for the 
systematic underestimation of the atrial size using echocardiography compared to 
MR and CT measurements [316]. Examples of three atrial instances with a left 
atrial volume of 31 ml, 47 ml and 65 ml are shown in Fig. 9.1. 
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LAV = 30.73 ml LAV = 47.23 ml LAV = 64.71 ml View 


Left atrial volume (LAV) 


Figure 9.1: Examples of three atrial geometries (endo- and epicardium) derived from the 
SSM with different LAV values. Anterior and posterior point of views are shown in the top 
and bottom panel, respectively. The LAV is marked in blue and was calculated as the volume 
enclosed by the surface of the left atrial body excluding the left atrial appendage area and 
the pulmonary vein ostia. Figure adapted from Nagel et al. [121] with permission from the 
publisher. 


For each atrial model, a baseline conduction velocity (CV) was assigned based 
on the values reported in [245] to 4 different atrial regions as listed in Table 9.1 and 
depicted in Figure 9.2. Nine variants of this CV setup were defined by modifying 
the velocities randomly in the interval [80 %, 120 %] of the baseline CV in each 
region. For each model and each CV setup, the Eikonal equation was solved 
using a mesh resolution of 1.22 mm with the Fast Iterative Method to obtain local 
activation times (LATs). Excitation was initiated at a sinus node exit site located 
at the junction of the right atrial appendage and the superior vena cava. Shifting 
a precomputed Courtemanche et al. [91] action potential in time according to 
the resulting LATs yielded the spatio-temporal distribution of the transmembrane 
voltage in the atria [299]. Each atrial geometry was rotated by a random rotation 
angle drawn from a uniform distribution in a range of [-15, 15]° around the x-, y- 
and z-axis [295] and were placed in two different torso geometries (see Fig. 9.4) 
derived from a human body SSM [179]. Torso 1 represents a male subject (body 
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surface area (BSA) = 1.7 m?) and torso 2 a female subject (BSA = 1.2 m?). The 
ECG forward problem was solved by means of the boundary element method 
as implemented by Stenroos et al.[107] and 12-lead ECGs were extracted at the 
standardized electrode positions. In this way, 1900 (95 atrial geometries x 2 torso 
geometries x 10 CV settings) simulated ECGs were generated . 


Table 9.1: Conduction velocities in transversal fiber direction (CV |) and anisotropy ratios (AR) 
for the baseline CV setup. 


Atrial Region CV, inm/s | AR=CV\/CV, 
Bulk tissue 0.59 2.11 
Bachmann’s bundle 0.64 3.33 
Crista terminalis 0.59 2.84 
Pectinate muscles 0.46 3.78 
Bachmann’s bundle Crista terminalis Pectinate muscles 


N 


//)) ` 


Figure 9.2: Example atrial geometry with three regions that are assigned different conduction 
velocities and anisotropy ratios. The bulk tissue makes up for the remaining areas in the left 
and right atrium. Figure adapted from Nagel et al. [121] with permission from the publisher. 


Ground truth labels for the simulated dataset were set based on the left atrial 
volume indexed to the body surface area (LAV/BSA). Applying the cutoff value for 
LAV/BSA of 34 ml/m2 as recommended in the cardiac chamber size quantification 
guidelines [88] for 2D echocardiography derived data resulted in 1,050 healthy 
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Figure 9.3: Examples for the P waves of the 12-lead ECG resulting from electrophysiological 
simulations conducted on the three models with different left atrial volumes as shown in 
Figure 9.1 calculated using constant rotation angles and the same torso model. Figure adapted 
from Nagel et al. [121] with permission from the publisher. 


and 850 LAE signals in the simulated database. Left and right atrial volumes 
indexed to the body surface area are visualized in Fig. 9.4 for both torso models 
and all 95 atrial geometries. 


9.2.1.2 Clinical Data 


Clinical ECGs of 10 seconds length from 9,485 healthy subjects and 421 LAE 
patients sampled at 500 Hz were extracted from the publicly available PTB-XL 
ECG database [16]. For 7,168 healthy and 309 LAE signals in this database, the 
certainty with which the expert cardiologist labeled the respective pathology, was 
specified as 100%. ECGdeli [250] was applied to extract the P waves from the 
time series of all healthy and LAE signals and to build a mean P wave template 
for each subject (see Figure 9.5) 
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Figure 9.4: Left (blue) and right (red) atrial volumes indexed to the respective body surface 
area (BSA) for both torso models. Torso model 1 and 2 had a BSA of 1.7 m? and 1.2 m, re- 
spectively, and are depicted together with the positions of the attached electrodes from the 
anterior and lateral view. Applying a threshold value of 34 ml/m2 to the LAV/BSA values 
yields the ground truth labels for the healthy (NORM) and the left atrial enlargement (LAE) 
cohorts. Figure adapted from Nagel et al. [121] with permission from the publisher. 


9.2.2 Machine Learning Classifier 


A long short-term memory (LSTM) network was trained to perform the binary 
classification between the healthy control group (NORM) and the left atrial en- 
largement cohort (LAE). The simulated P waves and the clinical P wave templates 
were filtered (0.5 Hz highpass cutoff frequency, 60 Hz lowpass cutoff frequency) 
and cut 40 samples (equivalent to 80 ms) before and 40 samples after the P wave 
peak in lead II occurred. In this way, it was ensured that the P waves are robustly 
and consistently extracted throughout the simulated and clinical dataset. The 
resulting time series signals of all 12 leads served as an input to the network. 70 %, 
15% and 15 % of the ECGs in both classes were split into a training, validation 

and test set, respectively. Three different training scenarios were considered. For 
scenario 1, only clinical signals with ground truth labels annotated with a certainty 

of 100 % were considered. For training scenario 2, the 1900 simulated P waves 
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Figure 9.5: Averaging single beat P waves for each 12-lead ECG in the clinical dataset. This 
procedure was performed to obtain a mean P wave per subject and patient. An exemplary 
clinical 12-lead ECG is visualized in blue along with the ECGdeli annotations for P wave on- 
and offset in each beat. Averaging over all P waves detected in the signal trace resulted in a 
mean P wave template for each subject. 


were additionally used as an input data source during training. The validation 
and test sets remained unchanged compared to scenario 1. In scenario 3, also the 
clinical NORM and LAE signals annotated with a certainty <100 % by the expert 
cardiologist were added to the training split. The train, validation and test compo- 
sitions for all training scenarios are summarized in Table 9.2. Accuracy, sensitivity, 
specificity metrics were calculated to evaluate the network performance of each 
training scenario, whereby the LAE signals were considered positive samples and 
the NORM signals negative samples (see section 3.4 and Figure 3.4). To avoid a 
bias of the network due to the larger number of samples in the NORM class in all 
training scenarios, the cross-entropy loss function was weighted by the inverse of 
the respective class support. 
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Table 9.2: Number of samples in the healthy (NORM) and the left atrial enlargement (LAE) 
classes for different training scenarios. In scenario 1, a subset of the reliably labeled clinical 
data is used for training. For training scenario 2, simulated ECGs served as an additional input 
to the clinical training data from scenario 1. In scenario 3, clinical ECGs for which ground truth 
labels were specified with a certainty <100 % by the expert cardiologist were added to the 
training data. For all three cases, the same validation set and a test set comprising each 15% 
of the reliably labeled clinical data was used. 


Training Scenario 1 | Training Scenario2 | Training Scenario 3 Validation Test 
es NORM: 5018 NORM: 5018 NORM: 7335 NORM: 1075 | NORM: 1075 
clinica 
LAE: 216 LAE: 216 LAE: 328 LAE: 47 LAE: 46 
NORM: 1050 
simulated - - - 
LAE: 850 
NORM: 5018 NORM: 6068 NORM: 7335 NORM: 1075 | NORM: 1075 
total 
LAE: 216 LAE: 1066 LAE: 328 LAE: 47 LAE: 46 


93 Resulls 


Figure 9.6 depicts the confusion matrix for all training scenarios evaluated on 
the test set. Sensitivity, specificity and accuracy obtained for training scenario 1 
were 0.83, 0.92 and 0.91, respectively. When additionally including simulated 
ECGs during training, as was the case in scenario 2, the results for sensitivity, 
specificity and accuracy quantified to 0.87, 0.95 and 0.95, respectively. Thus, 
all three performance metrics improved when including simulated data for the 
training of the classifier. When extending the clinical training dataset of scenario 
1 with additional clinical signals labeled with a certainty <100 % in scenario 3, 
sensitivity, specificity and accuracy decreased to 0.78, 0.84 and 0.83 compared to 
the training scenario comprising only reliably labeled data. 
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Figure 9.6: Confusion matrices and performance metrices resulting from predicting the labels 
of the test set with the network trained with each of the three training sets summarized in 
Table 9.2. Figure adapted from Nagel et al. [121] with permission from the publisher. 


94 Discussion 


9.4.1 Main Findings 


It was investigated whether the application of a bi-atrial SSM for generating a large 
synthetic ECG database can yield valuable additional input data for the training 
of a machine learning classifier to distinguish between clinical healthy and LAE 
ECGs. It was shown that the performance metrics of the network increased if 
simulated ECGs were added to reliably labeled clinical training data. The lowest 
scores for sensitivity, specificity and accuracy were achieved when additional 
clinical data with uncertain ground truth labels were included during training. 
Thus, expanding a clinical training dataset with simulated data can be preferable 
to either resigning oneself to a smaller clinical training split or extending the latter 
with additional unreliably tagged clinical data. 
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The advantage of deriving various in silico models of the atria from a SSM is on 
the one hand, that an arbitrary amount of geometries can be generated representing 
a large virtual patient cohort. In this way, a large number of simulated ECGs can 
be generated contributing to the extension of the training data which otherwise 
typically only consist of a small amount of clinical signals. In this study, not 
only anatomical variability was incorporated by varying the atrial and thoracic 
geometries but also functional variability was added consisting of 10 different CV 
setups for each atrial model to capture inter-patient ECG variabilities to an even 
greater extent. On the other hand, the shape of the individual instances can be 
freely chosen by optimizing the eigenvector coefficients of the SSM. This implies 
that in the particular use case of identifying LAE based on 12-lead ECGs with 
machine learning techniques, the geometries can hold uniformly distributed left 
atrial volumes. In this way, the ground truth of the underlying pathologies for 
each signal is precisely known and also the distribution of the signals in the two 
classes can be chosen a priori. In doing so, the class imbalance for the training 
set was substantially reduced from 1:23 to 1:6. Furthermore, two different torso 
geometries were derived, representing a male and a female geometry. In this 
way, a one-to-one distribution of male and female ECGs in the simulated dataset 
prevailed contributing to a compensation of the gender bias in medical datasets. 

The class imbalance during training was addressed by multiplying the cross- 
entropy loss function with the inverse of the class support. Considering sensitivity 
and specificity metrics in addition to evaluating the network’s accuracy perfor- 
mance demonstrated that the class-wise accuracy was above 0.8 in all scenarios 
and thus, no notable overfitting towards the over-represented NORM class, es- 
pecially in those scenarios where the classifier was only trained on clinical data, 
occurred. 


9.4.2 Related Work 


One key aspect for a successful and meaningful application of simulated ECGs to 
classification problems with clinical signals is assessing fidelity and comparability 
of the simulated data to the clinical ECGs and previously published simulation 
studies. In chapter 5 [169] it was shown that the P wave duration distribution 
originating from simulations on instances of the SSM with Gaussian distributed 
eigenvector coefficients is in accurate accordance with P wave durations reported 
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in large clinical cohort studies of healthy subjects. The altered ECG signal 
characteristics in LAE patients reported in clinical studies and comprising an 
increased duration and absolute amplitude in the negative deflection of the P wave 
in lead V1 as well as an increased overall duration of P wave in all leads were 
also reflected in the simulation results (see Figure 9.3). However, a data-driven 
comparison between the synthetic and the clinical dataset for the healthy and 
LAE cases would further help to assess fidelity of the simulated ECGs (compare 
chapter 6). 

In this study, the absolute amplitude and the duration of both, the positive 
and the negative deflection of the P wave in lead V1 increased with the left atrial 
volume (see Fig. 9.3). Andlauer et al. [176, 317] reported that left atrial concentric 
hypertrophy causes an increase in the absolute P wave amplitude of the negative 
deflection in V1, whereas left atrial dilation with a constant myocardial volume 
does not have a marked effect on the ECG morphology in V1. In contrast to the 
work presented in this chapter, the shape of the right atrium remained constant 
which might explain the additional alterations in the positive deflection in V1 
for the simulation results. Furthermore, for any stage of left atrial dilation, the 
myocardial volume was kept constant in the study conducted by Andlauer et 
al. [176], implying that the wall thickness decreased with an increasing left atrial 
volume. In the study shown in this chapter, a constant wall thickness of 3 mm 
was ensured for all atrial geometries, i.e. the myocardial volume increases with 
the left atrial size. This could thus explain the absolute increase of the negative 
amplitude in V1 for increasing LAV values as seen in Fig. 9.3 and reported for 
left atrial hypertrophy by Andlauer er al. [176]. 


9.4.3 Limitations 


The network used for the classification was not explicitly optimized. A standard 
setup was used for a binary LSTM classifier. Tuning the network’s hyperparame- 
ters or using a different network architecture might further improve the results. 
However, in this work, the focus was not on developing a robust classification 
method for healthy and LAE patients in the first place. Instead, the hypothesis 
that applying a bi-atrial SSM as a basis to generate a large database of synthetic 
simulation-derived electrophysiological signals with a predefined class distribu- 
tion and precisely known ground truth labels can improve real-world classifier 
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performance was put to test. It was shown that this approach can help to overcome 
the drawbacks of only using clinically recorded ECGs as the classifier perfor- 
mance improved when a combination of clinical and simulated data was employed 
during training. 

Resorting to the Eikonal model and the boundary element method as sim- 
plified model solutions for normal sinus rhythm simulations has been shown to 
yield P waves of the 12-lead ECG resembling the full bidomain simulation results 
to a high degree [164, 303]. In this study, two torso models were used for the 
generation of the in silico database. As shown previously [122] and reported in 
chapter 8, the torso geometry mainly has an influence on the ECG amplitudes, 
whereas the P wave morphology is primarily dependent on the atrial geometry and 
functional variability. Therefore, 95 atrial geometries were considered with 10 dif- 
ferent conduction velocity settings and only 2 different torso models representing 
a male and a female thoracic geometry were used. Since the torso models were 
derived from an SSM [179], the location of the electrodes could be consistently 
defined on both models and no electrode placement variability is reflected in the 
simulated dataset. Future directions could include the consideration of further 
patient characteristics such as the age, sex, weight and accompanying pathologies 
for an improved classification result [318]. Furthermore, adding more simulated 
data covering a larger torso variability and electrode placement variations to the 
training dataset could also yield improved results. 

Even though the latest recommendations for diagnosing LAE were complied 
with by assessing LAV indexed to the BSA for each model combination to define 
the ground truth labels for the virtual patient cohort, these values are determined 
with approximation formulas in clinical practice and are therefore prone to errors. 
By including a multiplication factor in the analytical LAV calculation to account 
for the systematic underestimation of the atrial volume with 2D echocardiography 
compared to MR measurements, it was intended to correct for the inaccuracies and 
establish comparability to the clinical reference volumes. However, this correction 
factor itself and also the calculation of the BSA might still not lead to the exact 
same values that would have been set in clinical practice. Thus, even though the 
ground truth labels in the simulation dataset are analytically correct, they could 
have varied if the volume indexed to the BSA was calculated with the clinical 
approximation functions based on height, weight and gender of the patient. 
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95 Conclusion 


In conclusion, it was shown that the application of a bi-atrial SSM as a basis for 
generating synthetic ECGs can help to overcome the main drawbacks of clinically 
recorded signals for an automated classification of healthy and LAE ECGs with 
machine learning techniques: Firstly, an arbitrary number of atrial geometries can 
be derived leading in combination with functional electrophysiological variability 
to synthetic ECG signals representing a large virtual patient cohort. Secondly, 
the distribution of the synthetic ECGs in the different pathology classes can be 
freely chosen by drawing the eigenvector coefficients of the shape model from 
a custom statistical distribution so that left atrial volumes in the virtual cohort 
are uniformly distributed. Thirdly, reliable ground truth labels can be assigned 
to the simulated signals since diagnostic values for left atrial volumes and body 
surface areas are analytically calculable for the atrial and thoracic finite element 
models. These advantages finally contribute to an improved classification result 
of clinical healthy and left atrial enlargement ECGs when simulated data is added 
to the training set. 
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FINAL REMARKS 


Chapter ] 0 


Outlook 


10.1 Application and Validation of Multiscale 
Populations of Atrial Models 


Ideas for future projects involve applying an extensive population of atrial models 
for in silico clinical trials or for atrial fibrillation (AF) vulnerability studies. 
For this purpose, the model cohort presented in chapter 6 could be extended 
to systematically study differences in disease and treatment response among 
different cohorts, for example male vs. female subjects, or different age groups. 
Subsequently, vulnerability studies could reveal arrhythmogenic propensity of 
different subpopulations and unravel key model characteristics responsible for 
initiating and maintaining AF not only on a patient-specific level, but for entire 
subgroups of patients. Finally, the efficacy of different therapy options can be 
studied to investigate whether specific treatment strategies are particularly suitable 
for a specific subpopulation. 


10.1.1 Extension of the Model Population 


The construction of separate atrial statistical shape models (SSMs) representing for 
example healthy and AF patients or male and female subjects can provide a means 
to particularly derive instances known to reflect specific shape characteristics of 
the respective group. Furthermore, mapping a scalar field showing the probability 
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of low-voltage being present in different atrial regions across a clinical population 
onto the SSM as proposed by Nairn et al. [84, 304], could allow for directly 
generating endocardial instances with local arrhythmogenic substrate areas to be 
remodeled in the course of the simulations. To further enlarge the cohort, different 
torso shapes and rotation angles of the atria could be included into the population 
of models as explained in chapter 6. Thus, various cohorts of multiscale atrial 
computational models representing a wider range of variability can be derived, 
calibrated and grouped into different subpopulations based on various properties, 
as for example AF history, anatomical characteristics, age or gender. 

Age is one of the major risk markers for AF development [319], which is 
why an independent analysis of AF inducibility for an advanced age group could 
provide mechanistic insights into the driving forces for AF in affected subjects. 
Likewise, AF incidence rates are reported to be higher in men than in women [320]. 
Predominantly, more round left atrial shapes are observed in men, potentially 
representing one of the causes for increased AF onsets in male patients [318]. A 
separate shape analysis for male and female AF patients could therefore contribute 
to systematically investigate the impact of gender-related atrial shape variability 
on the onset and termination of AF. Also for the cellular model population, a 
gender-specific analysis regarding their risk of AF development can be carried out 
as done in previous work [321]. Ionic parameters for the cell model population 
can be chosen to capture female and male electrophysiology, so that a sex-specific 
evaluation regarding their AF vulnerability can be carried out. The feasibility 
of generating atrial instances randomly (see chapter 6 and chapter 8) and with 
systematic constraints regarding atrial volumes (see chapter 9) as demonstrated 
in this thesis can be built on for generating and calibrating different multiscale 
model populations. 


10.1.2 Vulnerability Study 


By applying an arrhythmia inducibility protocol, such as the one proposed by 
Azzolin et al. [48] for example, AF vulnerability for each model in all subgroups 
can be studied systematically and comprehensively. Arrhythmia propensity can 
be quantified by the number of pacing sites from which reentry patterns could 
be induced, the dynamics of the induced episodes (duration and complexity) 
and the areas in which arrhythmia drivers anchor. The results of the arrhythmia 
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vulnerability study can be evaluated to investigate whether there are specific 
suberoups that have an increased pro-arrhythmic risk compared to others. Work 
by Vagos et al. [256] suggests that ionic remodeling parameters are independent 
predictors for AF propensity. Furthermore, Jia et al. [65] showed that left atrial 
shape is associated with AF susceptibility. Both of these properties provide the 
necessary requirements for an arrhythmia to be sustained, namely a reduced 
conduction velocity, a shortened refractory period and an increased path length. 
With a population of multiscale models comprising the relevant variability, a fully 
integrated approach becomes feasible. Screening cohorts characterized both by 
different anatomical and functional properties systematically allows for identifying 
key remodeling parameters driving the arrhythmia and potentially dominate other 
factors. Subgroups more prone to arrhythmia development could be identified 
e.g. by means of fuzzy C-means or by Density-Based Spatial Clustering of 
Applications with Noise (DBSCAN) methods based on the arrhythmia propensity 
quantification results. Further methods to divide the model cohorts into subgroups 
characterized by their AF vulnerability could include decision trees as applied by 
Jia et al. [65] or sensitivity analysis as done by Vagos et al. [256]. Furthermore, 
electrocardiograms (ECGs) of the atrial normal sinus rhythm (P waves) and re- 
entry activity (f waves and F waves for fibrillation and flutter, respectively) can 
be computed for all instances in the model cohort. P waves could then serve as 
an input to machine learning techniques, taking either the raw ECG signals or 
selected features as an input. The algorithms could be trained to estimate the 
number of inducing points and other arrhythmia markers found for the respective 
model in the course of the vulnerability study as a surrogate for the patient’s 
individual risk to develop new-onset AF. 


10.1.3 Treatment Efficacy Assessment 


Based on the reentry simulations, the two major rhythm control options for treating 
AF can be applied. On the one hand, the efficacy of different pharamacological 
ion channel blockers and their doses [60, 322-324] for terminating reentry in all 
models can be investigated. On the other hand, it can be examined if there are 
systematic rules to place ablation lesions for different subgroups in the cohort to 
successfully terminate the arrhythmia and prevent recurrence. Ablation strategies 
to be tested could include pulmonary vein isolation or ablation of fibrotic areas 
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with and without connection to non-conducting anatomical structures and a com- 
bination of the above. In this way, systematic rules for ablation target selection 
depending on anatomical and functional characteristics of the patients represented 
by the models could be retrieved. Furthermore, it can be Investigated if the optimal 
ablation lines and drug doses can be identified based on the f waves computed 
for each individual model using similar machine learning strategies as described 
above. In this way, a non-invasive prediction of the optimal treatment strategy 
could precede catheter ablation and reduce procedure times. 


10.2 ECG-based Differential Diagnosis of Atrial 
Fibrillation Related Pathologies 


The ECGs generated in the studies and described in this thesis are representative 
for a healthy cohort (see chapter 6 and chapter 8) and patients with arrhythmogenic 
fibrotic atrial cardiomyopathy (FAM) (see chapter 8), left atrial enlargement (LAE) 
(see chapter 9) and interatrial conduction block (IAB) (see chapter 7). The changes 
in ECG characteristics between the healthy control group and each of the cohorts 
representing the above mentioned atrial pathologies, have been examined in detail 
and could individually be discriminated from the healthy signals using machine 
learning techniques. However, all of the examined atrial pathologies reflect partly 
in similar ECG changes, as for example P wave duration is increased in patients of 
all three above listed diseases. Thus, future projects could focus on investigating 
whether a differential diagnosis of healthy, FAM, LAE and IAB ECGs is feasible. 
In this way, risk stratification of AF can be performed more comprehensively and 
unveil the specific remodeling aspect driving the arrhythmia, whether it being 
structural (in case of FAM or IAB) or anatomical (in case of LAE). Thus, therapy 
planning can be further optimized and procedure times can be notably reduced by 
pointing towards the underlying arrhythmogenic mechanism prior to the actual 
intervention. For this purpose, the P wave databases generated in chapter 8, 
chapter 9 and by Bender et al. [260] could be employed initially to compute atrial 
ECG biomarkers without a necessary preprocessing step to delimit the single 
waveforms of the 12-lead ECG affecting the accuracy of extracted P waves and 
features. Following, the 12-lead ECG curves described in chapter 7 could be 
used to train the machine learning algorithm directly to investigate if a differential 


202 


10.2. ECG-based Differential Diagnosis of Atrial Fibrillation Related Pathologies 


diagnosis can be made automatically without extensive feature extraction steps 
preceding the development of the machine learning model. Also in this regard, 
the explicit benefit of simulated data in this multi-class classification task could 
be assessed by quantifying the improvement or decline in prediction accuracy in 
case the model is trained only on clinical data or on a hybrid ECG dataset. 
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Conclusion 


In summary, the main focus of this thesis was the large- and multiscale genera- 
tion of atrial computational models for electrophysiological simulations leading 
ultimately to P waves of the 12-lead electrocardiogram (ECG). These are for 
example applicable as an extension to clinical signals in the machine learning 
context. To accelerate the generation of large synthetic P wave datasets, simplified 
propagation models and forward calculation methods were examined to later on 
choose the most appropriate simulation approach ensuring physiological accuracy 
of the resulting signals, yet at reasonable computational cost. To overcome the 
lack of anatomical variability in recent atrial cohort simulation studies, a bi-atrial 
statistical shape model (SSM) was developed and evaluated. Based thereon, a 
multiscale population of atrial computer models comprising not only anatomical 
variability, but also functional variability on both tissue and cellular scale, was 
compiled and calibrated. Two large-scale simulation studies were performed to 
investigate the added value of simulated ECG data enriching clinically recorded 
signals for classification and regression tasks with machine learning algorithms. 
These served the purpose to answer the overall research question underlying this 
thesis (see section 1.3), namely, whether the performance of a machine learning 
model for atrial fibrillation (AF) risk stratification trained on a hybrid dataset 
leads to improved results compared to training on clinical data only. The former 
training scenario yielded improved results for the ECG-based detection of both 
left atrial enlargement (LAE) and arrhythmogenic fibrotic atrial cardiomyopathy 
(FAM), with details outlined below. In this regard, it was investigated whether the 
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inherent properties of electrophysiological simulations comprising the generation 
of large, balanced and reliably annotated data samples, can counteract the typical 
shortcomings prevailing in clinical datasets. Hence, two computational model 
cohorts were employed to generate P waves characterized by the presence of 
fibrotic substrate in the atria as well as by an enlarged left atrial volume. Both 
were used for the training of a machine learning model to predict the underlying 
structural and anatomical alterations in the atria which are predisposing factors for 
the development of AF. The main conclusion drawn from the studies conducted 
within the scope of this thesis and summarized above are as follows. 


For simulations of normal sinus rhythm in atrial electrophysiology with and 
without the inclusion of fibrotic tissue, the Eikonal model and the boundary 
element method (BEM) can be employed to simulate P waves of a comparable 
accuracy to the gold standard simulation approaches at a reduced computa- 
tional cost. 

Independent on the methodology to model fibrotic tissue, the Eikonal model 
combined with the BEM yielded P waves exhibiting correlation coefficients above 
0.9 to the gold standard full-blown bidomain simulation. Pseudo-ECGs generated 
with the infinite volume conductor method failed to accurately reproduce P waves 
in the precordial leads and led to amplitude overshoot mainly visible in lead 
V1-V3. Only in simulation scenarios where repolarization dynamics are of sig- 
nificant importance, e.g., for reentry simulations, the diffusion term in simplified 
propagation models can no longer be neglected as this assumption resulted in 
action potential durations notably deviating from the bidomain results. 


A bi-atrial SSM can serve as a basis for including atrial anatomical variability 
in populations of computer models contributing to represent variability in 
ECG biomarkers and P wave morphology occurring also in large clinical 
cohorts. 

The atrial SSM was built using automatically found corresponding landmarks on 
47 individual atrial image segmentations. It allows for deriving various endocar- 
dial surfaces that can be augmented to simulation-ready volumetric geometries 
representing atrial anatomy in a population. Atrial anatomy was shown to be a 
main contributing factor to varying P wave morphology in ECG simulations of 
virtual cohorts. 
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If properly processed, simulated data can contribute to improve the pre- 
diction of fibrotic tissue volume fractions on clinical ECGs when provided as 
additional input to a shallow feature-based feedforward neural network. 
The fraction of left atrial fibrotic tissue can be estimated based on extracted P wave 
biomarkers from clinical signals with an absolute error of 17.5 % fibrotic volume 
by a shallow neural network. When adding simulated P wave features to the 
clinical training set, the estimation error can be reduced to 16.5 % fibrotic volume. 
In this way, adding features of simulated signals during training counteracted the 
disadvantages of small and imbalanced clinical datasets. It was also found that 
the degree of pre-processing the simulated signals, i.e. the addition of noise and 
the chosen cutoff frequency of a low-pass filter, was decisive for the quality of 
the network outcome. This is because the addition of noise and a high cutoff 
frequency for the lowpass-pass filter strike a balance in closing the domain gap 
to clinical signals and preserving subtle ECG characteristics attributable to the 
presence of fibrotic substrate in the atria. 


Key P wave biomarkers for estimating the volume fraction of left atrial 
fibrosis comprise mainly features relying on the detection of the P wave off- 
set. This indicates that robust feature extraction methods are required for 
delineating the P wave ending in clinical signals. 

When adding noise directly to the extracted P wave features in the simulated 
database, the prediction accuracy of the neural network was impaired the most 
for biomarkers whose calculation was based on the P wave offset, such as P wave 
duration and the root mean squared voltage in the terminal signal parts of the 
P wave in individual leads. Thus, the delineation of P waves, especially the 
detection of the P wave ending in clinical ECGs, requires robust, sophisticated 
and reliable methods to ensure an accurate calculation of P wave biomarkers for 
training the network. 


A deep learning network tailored at detecting LAE from clinical ECGs bene- 
fits from being provided simulated signals as an additional input data source 
during training. In contrast, adding further clinical data during training of 
which pathology labels were annotated with less than 100% certainty was 
detrimental to network performance. The accuracy of a long short-term mem- 
ory (LSTM) network for performing the binary classification task between clinical 
healthy and LAE ECGs quantified to 0.95 when trained on a hybrid dataset, to 0.91 
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when trained on clinical data with 100 % label certainty, and to 0.83 when trained 
on clinical data with all label certainties. Thus, adding simulated data for the 
training process of the classifier actually contributed to an improved performance, 
potentially due to the added signal variability of a more heterogeneous and diverse 
simulated dataset the network can be trained on. In contrast, extending the clinical 
dataset with further clinical ECGs of unreliably annotated ground truth labels 
was rather confounding and led to a declined classification outcome. Ultimately, 
this highlights the benefit of accurately known ground truth pathology labels 
for simulated signals definable by adjusting parameter settings in the underlying 
simulation run. 
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P waves and Amplitude Features 
in Fibrosis Remodeling Scenarios 


The following figures complement the results for P waves and peak-to-peak ampli- 
tude features computed with different propagation drivers and forward calculation 
methods in the remaining simulation scenarios not included in section 4.3. 
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(b) Peak-to-peak amplitude features 


Figure A.1: ECGs (P waves) calculated with the same forward calculation method (BEM) but 
different propagation models (color coded) with the transmembrane voltages resulting from 
the simulation scenario with fibrosis modeled as slow conducting tissue. Peak-to-peak ampli- 


tude features extracted from the ECGs calculated with BEM and the respective propagation 
driver are represented with the colored triangle. 
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(b) Peak-to-peak amplitude features 


Figure A.2: ECGs (P waves) calculated with the same forward calculation method (BEM) but 
different propagation models (color coded) with the transmembrane voltages resulting from 
the simulation scenario with fibrosis modeled as slow conducting tissue. Peak-to-peak ampli- 


tude features extracted from the ECGs calculated with BEM and the respective propagation 
driver are represented with the colored triangle. 
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(b) Peak-to-peak amplitude features 


Figure A.3: ECGs (P waves) calculated with the same forward calculation method (BEM) but 
different propagation models (color coded) with the transmembrane voltages resulting from 
the simulation scenario with fibrosis modeled as slow conducting tissue. Peak-to-peak ampli- 
tude features extracted from the ECGs calculated with BEM and the respective propagation 
driver are represented with the colored triangle. 
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(b) Peak-to-peak amplitude features 


Figure A.4: ECGs (P waves) calculated with the same propagation driver (bidomain) but differ- 
ent forward calculation method for the healthy control simulation scenario. ECGs calculated 
with FEM, BEM and IVC results are represented by the solid, dashed and dottes lines, respec- 
tively. Peak-to-peak amplitude features extracted from the ECGs calculated with FEM, BEM 
and IVC are represented with square, triangle and circle markers, respectively. 
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(b) Peak-to-peak amplitude features 


Figure A.5: ECGs (P waves) calculated with the same propagation driver (bidomain) but dif- 
ferent forward calculation method for the simulation scenario with fibrosis modeled as ionic 
conductance rescaling. ECGs calculated with FEM, BEM and IVC results are represented by 
the solid, dashed and dottes lines, respectively. Peak-to-peak amplitude features extracted 


from the ECGs calculated with FEM, BEM and IVC are represented with square, triangle and 
circle markers, respectively. 
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(b) Peak-to-peak amplitude features 


Figure A.6: ECGs (P waves) calculated with the same propagation driver (bidomain) but differ- 
ent forward calculation method for the simulation scenario with fibrosis modeled as passive 
conduction barriers. ECGs calculated with FEM, BEM and IVC results are represented by the 
solid, dashed and dottes lines, respectively. Peak-to-peak amplitude features extracted from 
the ECGs calculated with FEM, BEM and IVC are represented with square, triangle and circle 
markers, respectively. 
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Ventricular Pathologies in 
Large-scale ECG Dataset 


Figure B.1 exemplarily shows lead II of one simulated and synthesized 10s 
electrocardiogram (ECG) belonging to the healthy control, the right bundle branch 
block (RBBB), left bundle branch block (LBBB), atrioventricular block (AVB) 
and the myocardial infarction (MI) pathology class. 
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Figure B.1: Exemplary simulated and synthesized ECGs for ventricular pathologies and the 
healthy control. 
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Figure B.2 shows the distributions of characteristic features extracted from 
ECGs of the RBBB, the LBBB, the AVB and the MI simulated and clinical cohort. 
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Figure B.2: Feature distributions compared between the ventricular pathology classes and 


the healthy control. 
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